RB PATHWAY AND CHROMATIN REMODELING GENES THAT
ANTAGONIZE LET-60 RAS SIGNALING

Statement as to Federally Sponsored Research 

This work was supported in part by the National Institutes of Health 
(Grant No. GM24663). The government may have certain rights to this 
invention.

Background of the Invention 

In general, the invention features methods and compositions useful in 
the treatment of a neoplasia. 

Retinoblastoma (Rb) family proteins are mammalian tumor suppressors
that regulate cell proliferation. The Rb pathway is conserved among a variety of
species, including the nematode Caenorhabditis elegans. LIN-35 Rb, which is
the C. elegans counterpart of mammalian Rb, is required for normal
vulval development in C. elegans. C. elegans vulval development also requires
the activity of a conserved Ras signaling pathway. Mutations that disable
let-60 Ras and other genes in this pathway result in a vulvaless (Vul)
phenotype. Mutations that overactivate this pathway, for instance mutations
that create the same G13E substitution found in oncogenic forms of human
Ras, cause a multivulva (Muv) phenotype that is characterized by excessive
induction of vulval cell fates, leading to worms having multiple vulvae.

lin-35 Rb is a synthetic multivulva (synMuv) gene. The synMuv genes
antagonize the Ras signaling pathway that induces
vulval development in C. elegans. The synMuv genes are
grouped into two classes, A and B, such that a mutation in a gene of each class
is required to produce a multivulva phenotype. The class B synMuv genes
include homologs of other genes that function with Rb in transcriptional
regulation. Many synMuv genes have been cloned and molecularly
characterized. Loss-of-function mutations in two functionally redundant
pathways that are encoded by the class A and class B synthetic multivulva
(synMuv) genes also cause a Muv phenotype.
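The synthetic interaction described above, in which a Muv phenotype requires a mutation in a gene of each class, can be sketched in a few lines of Python. This is only an illustration of the genetic logic, not part of the disclosed methods. The function name and gene sets are hypothetical; the class A examples (lin-8, lin-15A) are assumed here for illustration, and the class B examples are genes encoding class B proteins discussed in this background.

```python
# Illustrative gene sets (assumptions for this sketch, not an exhaustive list).
CLASS_A = {"lin-8", "lin-15A"}
CLASS_B = {"lin-35", "dpl-1", "efl-1", "lin-53", "hda-1", "hpl-2"}


def predicted_phenotype(mutated_genes):
    """Return 'Muv' only when both redundant pathways are disrupted."""
    mutated = set(mutated_genes)
    has_class_a = bool(mutated & CLASS_A)  # at least one class A gene mutated
    has_class_b = bool(mutated & CLASS_B)  # at least one class B gene mutated
    return "Muv" if (has_class_a and has_class_b) else "wild-type"


if __name__ == "__main__":
    print(predicted_phenotype(["lin-35"]))             # class B only
    print(predicted_phenotype(["lin-15A", "lin-35"]))  # one gene of each class
```

Two mutations in the same class leave the other pathway intact, so the sketch reports a wild-type vulva in that case.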

In addition to LIN-35 Rb, other proteins with class B synMuv activity
are homologous to mammalian Rb-associated proteins. These other proteins
include DPL-1 and EFL-1, homologs of the DP and E2F transcription factors;
LIN-53, a homolog of the Rb-binding proteins RbAp46 and RbAp48; HDA-1, a
histone deacetylase homolog; and HPL-2, a heterochromatin protein 1 homolog.
The class B synMuv proteins act together to negatively regulate the
transcription of genes that promote vulval development. Initially, DPL-1 and
EFL-1 heterodimers bind DNA at specific regulatory sequences of vulval
cell-fate determination genes. DNA-bound DPL-1 and EFL-1 heterodimers recruit
LIN-35 Rb, which in turn recruits proteins that act to remodel chromatin. One
of these proteins, HDA-1, is predicted to deacetylate lysines of nucleosomal
histones. Deacetylation of lysine residues is required for their subsequent
methylation. HPL-2, another protein that may be recruited by LIN-35 Rb, is
expected to act like other HP1 family proteins and bind, via its chromodomain,
to methylated lysine residues of nucleosomal histones.

Given the similarities that exist between C. elegans and mammalian Rb
and Ras pathways, C. elegans provides an efficient, inexpensive, and facile
screening tool to identify novel clinical targets and chemotherapeutics useful in
the treatment of neoplasia.

Summary of the Invention 

The invention provides compositions useful in treating a neoplasia and
methods for identifying chemotherapeutic agents.

In one aspect, the invention features a method for identifying a 
compound that treats a neoplasia, the method involves (a) contacting a cell 
containing a mutation in a Class B synMuv gene selected from the group
consisting of: mep-1, lin(n3628), lin(n4256), and lin-65 and a second mutation
in a synthetic multivulval gene, or an ortholog thereof, with a candidate
compound; and (b) detecting a phenotypic alteration in the contacted cell 
relative to a control cell; where a candidate compound that alters the phenotype 
of the contacted cell relative to the control cell is a compound that treats a 
neoplasia. In one embodiment, the cell is in a nematode. In another
embodiment, the phenotypic alteration is an alteration in a multivulval 
phenotype. In another embodiment, the phenotypic alteration is an alteration in 
sterility. In another embodiment, the second mutation is in a synMuv class A 
gene. In another embodiment, the cell is an isolated mammalian cell. In 
another embodiment, the phenotypic alteration is a decrease in cell
proliferation. 
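Step (b) of such a screen compares a phenotype in contacted animals against control animals. One way this comparison might be scored, assuming the readout is a count of Muv animals in each group, is a two-sided Fisher's exact test. The sketch below is illustrative only: the function names, the 0.05 threshold, and the choice of test are assumptions and are not specified by the methods described here.

```python
from math import comb


def fisher_two_sided(muv_treated, n_treated, muv_control, n_control):
    """Two-sided Fisher's exact test p-value for Muv counts in two groups."""
    total = n_treated + n_control
    total_muv = muv_treated + muv_control
    denom = comb(total, n_treated)

    def prob(k):
        # Hypergeometric probability that exactly k treated animals are Muv,
        # with the group sizes and the total number of Muv animals held fixed.
        return comb(total_muv, k) * comb(total - total_muv, n_treated - k) / denom

    lo = max(0, n_treated - (total - total_muv))
    hi = min(n_treated, total_muv)
    observed = prob(muv_treated)
    # Sum every possible table that is no more likely than the observed one.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed * (1 + 1e-9))
    return min(1.0, p)


def alters_muv_phenotype(muv_treated, n_treated, muv_control, n_control, alpha=0.05):
    """Flag a candidate compound when Muv penetrance differs from control."""
    return fisher_two_sided(muv_treated, n_treated, muv_control, n_control) < alpha


if __name__ == "__main__":
    # Hypothetical counts: 0 of 50 treated animals Muv versus 45 of 50 controls.
    print(alters_muv_phenotype(0, 50, 45, 50))
```

A flagged compound would only be a candidate; the counts shown are invented for the example and do not report any experimental result.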

In another aspect, the invention provides a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing 
a cell having a mutation in a Class B synMuv gene selected from the group
consisting of mep-1, lin(n3628), lin(n4256), and lin-65 and having a second
mutation in a synMuv nucleic acid or ortholog thereof; (b) contacting the cell 
with a candidate compound; and (c) detecting a decrease in proliferation of the 
cell contacted with the candidate compound relative to a control cell not 
contacted with the candidate compound, where a decrease in proliferation
identifies the candidate compound as a candidate compound that treats a
neoplasia. In one embodiment, the cell is in a nematode. In another 
embodiment, the decrease in proliferation is detected by detecting inhibition of 
a Muv phenotype. In another embodiment, the cell has a mutation in Dp, E2F, 
or histone deacetylase. In another embodiment, the cell is an isolated
mammalian cell.

In another aspect, the invention provides a method of identifying a 
compound that treats a neoplasia, the method involves (a) providing a cell 
expressing a nucleic acid having at least 95% identity to a Class B synMuv 
gene selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and
lin-65; (b) contacting the cell with a candidate compound; and (c) monitoring
the expression of the nucleic acid, where an alteration in the level of expression of the
nucleic acid indicates that the candidate compound is a compound that treats a 
neoplasia. In one embodiment, the gene contains a reporter gene (e.g., lacZ, 
gfp, CAT, or luciferase). In another embodiment, expression is monitored by 
assaying protein level. In another embodiment, the expression is monitored by
assaying nucleic acid level. In yet another embodiment, the cell is in a 
nematode. 

In another aspect, the invention features a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing 
a cell expressing a Class B synMuv gene selected from the group consisting of:
mep-1, lin(n3628), lin(n4256), and lin-65; (b) contacting the cell with a
candidate compound; and (c) comparing the expression of the polypeptide in 
the cell contacted with the candidate compound to a control cell not contacted 
with the candidate compound, where an increase in the expression of the
polypeptide identifies the candidate compound as a candidate compound that
treats a neoplasia. In one embodiment, the cell is in a nematode. In another 
embodiment, the expression is monitored with an immunological assay. 

In another aspect, the invention features a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing
a cell expressing a Class B synMuv polypeptide selected from the group
consisting of: MEP-1, LIN(n3628), LIN(n4256), and LIN-65; (b) contacting
the cell with a candidate compound; and (c)
comparing the biological activity of the polypeptide in the cell contacted with
the candidate compound to a control cell not contacted with the candidate
compound, where an increase in the biological activity of the polypeptide
identifies the candidate compound as a candidate compound that treats a 
neoplasia. In one embodiment, the biological activity is monitored with an
enzymatic assay. In another embodiment, the biological activity is monitored
with an immunological assay. In yet another embodiment, the biological
activity is monitored with a nematode bioassay.

In another aspect, the invention features a method of identifying a
nucleic acid target of class B synMuv biological activity, the method involves 
(a) mutagenizing a C. elegans containing mutations in a Class B synMuv gene 
selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65
and in a Class A synMuv gene; (b) allowing the C. elegans to reproduce; and
(c) selecting a C. elegans containing a mutation that suppresses a synMuv 
phenotype; where the mutation identifies a nucleic acid target of class B 
synMuv biological activity. 

In another aspect, the invention features a method of identifying a 
nucleic acid target of class B synMuv biological activity, the method involves
(a) providing a microarray containing fragments of nematode nucleic acids; (b) 
contacting the microarray with detectably labeled nucleic acids derived from a 
nematode containing a mutation in a Class B synMuv gene selected from the 
group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65; and (c)
detecting an alteration in the expression of at least one nucleic acid of a C.
elegans containing a mutation in the Class B synMuv gene relative to the
expression of the nucleic acid in a control nematode, where an alteration in the 
expression identifies the nucleic acid as a nucleic acid target of class B synMuv 
biological activity. In one embodiment, the C. elegans further contains a
mutation in a second synMuv gene. In another embodiment, the C. elegans
further contains a mutation in a gene that results in a Vulvaless (Vul) 
phenotype. 

In another aspect, the invention features a method for identifying a 
nucleic acid that binds a synMuv class B polypeptide, the method involves (a)
providing nucleic acids derived from a nematode cell; (b) crosslinking the
nucleic acids and their associated proteins to form a nucleic acid-protein
complex; (c) contacting the nucleic acid-protein complex with an antibody 
against a polypeptide selected from the group consisting of MEP-1, 
LIN(n3628), LIN(n4256), and LIN-65; (d) purifying the nucleic acid-protein
complex using an immunological method; and (e) isolating the nucleic acid,
where the isolated nucleic acid is a nucleic acid that binds a synMuv class B
polypeptide. In one embodiment, the method further involves the following 
steps: (f) detectably labeling the nucleic acid of step (e); (g) contacting a 
microarray containing C. elegans nucleic acid fragments with the detectably 
labeled nucleic acid; and (h) detecting binding of the detectably labeled nucleic
acid, where the binding identifies the nucleic acid as a nucleic acid that binds a 
synMuv class B polypeptide. 

In another aspect, the invention provides a vector containing a nucleic 
acid having at least 95% identity to a Class B synMuv gene selected from the
group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65. In one
embodiment, the synMuv gene is mep-1 (SEQ ID NO:2). In another embodiment,
the synMuv gene contains a mutation selected from the group consisting of
n3680, n3702, and n3703. In other embodiments, the synMuv gene is
lin(n3628) (SEQ ID NO:24), lin(n4256) (SEQ ID NO:26), or lin-65 (SEQ ID
NO:28).

In another aspect, the invention provides an isolated cell containing the 
vector of the previous aspect. 

In a related aspect, the invention provides a nematode containing the 
nucleic acid of the previous aspect.

In another aspect, the invention provides a nematode containing a
mutation in a Class B synMuv gene selected from the group consisting of: mep-1,
lin(n3628), lin(n4256), and lin-65. In one embodiment, the mutation is a
mep-1 mutation selected from the group consisting of n3680, n3702, and 
n3703. 

In another aspect, the invention features a purified nucleic acid
containing a sequence that hybridizes under high stringency conditions to a
Class B synMuv nucleic acid selected from the group consisting of: mep-1, 
lin(n3628), lin(n4256), and lin-65.

In another aspect, the invention features an antibody against a Class B
synMuv polypeptide selected from the group consisting of: MEP-1, 
LIN(n3628), LIN(n4256), and LIN-65. 

In another aspect, the invention provides a method for identifying a 
compound that treats a condition characterized by inappropriate cell death, the
method involves (a) contacting a nematode containing a mutation in a Class B 
synMuv gene selected from the group consisting of: mep-1, lin(n3628),
lin(n4256), and lin-65 with a candidate compound; and (b) detecting a Muv
phenotype in the contacted nematode relative to a control nematode; where a
candidate compound that alters the phenotype of the contacted nematode
relative to the control nematode is a compound that treats a condition 
characterized by inappropriate cell death. In one embodiment, the cell is in a 
nematode. In another embodiment, the alteration is an alteration in a synMuv 
phenotype. 

In another aspect, the invention provides a method for identifying a
compound that treats a neoplasia, the method involves (a) contacting a cell
containing a mutation in a gene encoding KIAA1732 and a second mutation in 
a synMuv nucleic acid, or an ortholog thereof, with a candidate compound; (b) 
detecting a phenotypic alteration in the contacted cell relative to a control cell;
where a candidate compound that alters the phenotype of the contacted cell
relative to the control cell is a compound that treats a neoplasia. In one 
embodiment, the synthetic multivulval gene is a synMuv class A gene. In 
another embodiment, the cell is an isolated mammalian cell. In another 
embodiment, the phenotypic alteration is a decrease in cell proliferation. 

In another aspect, the invention features a method for identifying a
candidate compound that treats a neoplasia, the method involves (a) providing
a cell having a mutation in a nucleic acid encoding KIAA1732 and having a 
second mutation in a synMuv nucleic acid, or ortholog thereof; (b) contacting 
the cell with a candidate compound; and (c) detecting a decrease in
proliferation of the cell contacted with the candidate compound relative to a
control cell not contacted with the candidate compound, where a decrease in
proliferation identifies the candidate compound as a candidate compound that 
treats a neoplasia. In one embodiment, the cell has a mutation in Dp, E2F, or 
histone deacetylase. In another embodiment, the cell is an isolated mammalian
cell.

In another aspect, the invention provides a method of identifying a 
compound that treats a neoplasia, the method involves (a) providing a cell 
expressing a nucleic acid having at least 95% identity to a nucleic acid that 
encodes KIAA1732; (b) contacting the cell with a candidate compound; and (c) 
monitoring the expression of the nucleic acid, where an alteration in the level of
expression of the nucleic acid indicates that the candidate compound is a 
compound that treats a neoplasia. In one embodiment, the gene contains a 
reporter gene (e.g., lacZ, gfp, CAT, or luciferase). In another embodiment, 
expression is monitored by assaying protein level. In another embodiment, the
expression is monitored by assaying nucleic acid level. In another
embodiment, the cell is an isolated mammalian cell. 

In another aspect, the invention provides a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing 
a cell expressing a KIAA1732 polypeptide; (b) contacting the cell with a
candidate compound; and (c) comparing the expression of the polypeptide in
the cell contacted with the candidate compound to a control cell not contacted 
with the candidate compound, where an increase in the expression of the 
polypeptide identifies the candidate compound as a candidate compound that 
treats a neoplasia. In one embodiment, the cell is an isolated mammalian cell.
In another embodiment, the expression is monitored with an immunological
assay. 

In another aspect, the invention features a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing 
a cell expressing a KIAA1732 polypeptide; (b) contacting the cell with a 
candidate compound; and (c) comparing the biological activity of the
polypeptide in the cell contacted with the candidate compound to a control cell
not contacted with the candidate compound, where an increase in the biological 
activity of the polypeptide identifies the candidate compound as a candidate 
compound that treats a neoplasia. In one embodiment, the biological activity is 
monitored with an enzymatic assay. In another embodiment, the biological
activity is monitored with an immunological assay. In another embodiment, 
the biological activity is methyltransferase activity.

In another aspect, the invention features a method for identifying a 
nucleic acid that binds KIAA1732, the method involves (a) providing nucleic
acids derived from a mammalian cell; (b) crosslinking the nucleic acids and
their associated proteins to form a nucleic acid-protein complex; (c) contacting 
the nucleic acid-protein complex with an anti-KIAA1732 antibody; (d) 
purifying the nucleic acid-protein complex using an immunological method; 
and (e) isolating the nucleic acid, where the isolated nucleic acid is a nucleic
acid that binds KIAA1732. In one embodiment, the method further involves
the following steps: (f) detectably labeling the nucleic acid of step (e); (g) 
contacting a microarray containing human nucleic acid fragments with the 
detectably labeled nucleic acid; and (h) detecting binding of the detectably 
labeled nucleic acid, where the binding identifies the nucleic acid as a nucleic
acid that binds KIAA1732.

In another aspect, the invention provides a vector containing a nucleic 
acid having at least 95% identity to SEQ ID NO:36. 

In another aspect, the invention provides an isolated cell containing the 
vector of the previous aspect. 

In another aspect, the invention provides a method for identifying a
compound that treats a neoplasia, the method involves (a) contacting a
nematode containing a mutation in a Class C synMuv gene selected from the
group consisting of hat-1, epc-1, and ssl-1 with a candidate compound;
and (b) detecting an altered phenotype in the contacted nematode relative to
a control nematode; where a candidate compound that alters the phenotype of
the contacted nematode relative to the control nematode is a compound that
treats a neoplasia. In one embodiment, the alteration is an alteration in vulval 
phenotype. In another embodiment, the alteration is an alteration in sterility. 
In another embodiment, the synMuv class C gene is trr-1. In another 
embodiment, the mutations are selected from the group consisting of n3630,
n3637, n3704, n3708, n3709, and n3712. 

In another aspect, the invention provides a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing 
a cell having a mutation in a Class C synMuv gene selected from the group 
consisting of trr-1, hat-1, epc-1, and ssl-1 and having a second mutation in a
synMuv nucleic acid or ortholog thereof; (b) contacting the cell with a 
candidate compound; and (c) detecting a decreased proliferation of the cell 
contacted with the candidate compound relative to a control cell not contacted 
with the candidate compound, where a decrease in proliferation identifies the
candidate compound as a candidate compound that treats a neoplasia. In one
embodiment, the cell is in a nematode. In another embodiment, the nematode
displays an alteration in a synMuv phenotype. In another embodiment, the cell 
contains a mutation in a class A or class B synMuv gene. 

In another aspect, the invention provides a method for identifying a
compound that treats a neoplasia, the method involves (a) contacting a
nematode containing a mutation in a Class C synMuv gene selected from the
group consisting of trr-1, hat-1, epc-1, and ssl-1 and a second mutation in a
Class A synthetic multivulval gene with a candidate compound; and (b)
detecting an altered phenotype in the contacted nematode relative to a control
nematode; where a candidate compound that alters the phenotype of the
contacted nematode relative to the control nematode is a compound that treats a
neoplasia. In one embodiment, the alteration is an alteration in synMuv
phenotype. In another embodiment, the alteration is an alteration in sterility.

In another aspect, the invention provides a method for identifying a
compound that treats a neoplasia, the method involves (a) contacting a
nematode containing a mutation in a Class C synMuv gene selected from the
group consisting of trr-1, hat-1, epc-1, and ssl-1 and a second mutation in a
Class B synthetic multivulval gene with a candidate compound; (b) detecting 
an altered phenotype in the contacted nematode relative to a control nematode;
where a candidate compound that alters the phenotype of the contacted
nematode relative to the control nematode is a compound that treats a 
neoplasia. In one embodiment, the alteration is an alteration in synMuv
phenotype. In another embodiment, the alteration is an alteration in sterility.

In another aspect, the invention features a method for identifying a candidate
compound that treats a neoplasia, the method involves (a) providing a cell
having a mutation in a Class C synMuv gene selected from the group 
consisting of trr-1, hat-1, epc-1, and ssl-1 and having a second mutation in a 
synMuv gene or ortholog thereof; (b) contacting the cell with a candidate 
compound; and (c) detecting a decreased proliferation of the cell contacted
with the candidate compound relative to a control cell not contacted with the
candidate compound, where a decrease in proliferation identifies the candidate 
compound as a candidate compound that treats a neoplasia. In one 
embodiment, the cell is in a nematode. In another embodiment, the nematode 
displays an alteration in a synMuv phenotype. 

In another aspect, the invention provides a method of identifying a
compound that treats a neoplasia, the method involves (a) providing a cell
expressing a nucleic acid having at least 95% identity to a Class C synMuv 
nucleic acid selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1; 
(b) contacting the cell with a candidate compound; and (c) monitoring the
expression of the nucleic acid, where an alteration in the level of expression of the
nucleic acid indicates that the candidate compound is a compound that treats a 
neoplasia. In one embodiment, the gene contains a reporter gene. In another 
embodiment, the reporter gene contains lacZ, gfp, CAT, or luciferase. In 
another embodiment, the expression is monitored by assaying protein level. In
yet another embodiment, the expression is monitored by assaying nucleic acid
level. In yet another embodiment, the nucleic acid is in a nematode. 

In another aspect, the invention provides a method for identifying a 
candidate compound that treats a neoplasia, the method involves (a) providing
a cell expressing a Class C synMuv polypeptide selected from the group
consisting of TRR-1, HAT-1, EPC-1, and SSL-1; (b) contacting
the cell with a candidate compound; and (c) comparing the expression of the 
polypeptide in the cell contacted with the candidate compound to a control cell 
not contacted with the candidate compound, where an increase in the
expression of the polypeptide identifies the candidate compound as a candidate
compound that treats a neoplasia. In one embodiment, the cell is in a
nematode. In another embodiment, the expression is monitored with an
immunological assay. 

In another aspect, the invention provides a method for identifying a
candidate compound that treats a neoplasia, the method involves (a) providing
a cell expressing a Class C synMuv polypeptide selected from the group 
consisting of TRR-1, HAT-1, EPC-1, and SSL-1; (b) contacting the cell 
with a candidate compound; and (c) comparing the biological activity of the 
polypeptide in the cell contacted with the candidate compound to a control cell
not contacted with the candidate compound, where an increase in the biological
activity of the polypeptide identifies the candidate compound as a candidate 
compound that treats a neoplasia. In one embodiment, the cell is in a 
nematode. In another embodiment, the biological activity is monitored with an 
enzymatic assay. In another embodiment, the biological activity is monitored
with an immunological assay.

In another aspect, the invention provides a method of identifying a 
nucleic acid target of a synMuv Class C polypeptide, the method involves (a) 
mutagenizing a C. elegans containing a first mutation in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and a
second mutation in a Class A or Class B synMuv gene; (b) allowing the C.
elegans to reproduce; and (c) selecting a C. elegans containing a mutation that
suppresses a synMuv phenotype; where the mutation identifies a nucleic acid 
target of a synMuv class C polypeptide. In one embodiment, the second 
mutation is in a class A synMuv gene. In another embodiment, the second 
mutation is in a Class B synMuv gene.

In another aspect, the invention provides a method for identifying a
nucleic acid target of a synMuv Class C polypeptide, the method involves (a)
providing a C. elegans containing a mutation in a Class C synMuv gene
selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1; (b) growing
the C. elegans on bacteria expressing a dsRNA; and (c) identifying a dsRNA
that suppresses a synMuv phenotype; where the dsRNA identifies a nucleic 
acid target of a synMuv class C polypeptide. 

In another aspect, the invention provides a method for identifying a
nucleic acid target of a synMuv class C polypeptide, the method involves (a)
providing a C. elegans containing mutations in a Class C synMuv gene selected
from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and in a Class A or
Class B synMuv gene; (b) growing the C. elegans on bacteria expressing a
dsRNA; and (c) identifying a dsRNA that suppresses a synMuv phenotype;
where the dsRNA identifies a nucleic acid target of a synMuv class C
polypeptide.

In another aspect, the invention features a method of identifying a 
nucleic acid whose expression is modulated by a synMuv class C polypeptide, 
the method involves (a) providing a microarray containing fragments of 
nematode nucleic acids; (b) contacting the microarray with detectably labeled
nucleic acids derived from a nematode containing a mutation in a Class C
synMuv gene selected from the group consisting of trr-1, hat-1, epc-1, and
ssl-1; and (c) detecting an alteration in the expression of at least one nucleic acid
of a C. elegans containing a mutation in the synMuv class C gene relative to
the expression of the nucleic acid in a control nematode, where an alteration in
the expression identifies the nucleic acid as a nucleic acid modulated by a
synMuv class C polypeptide. In one embodiment, the C. elegans further
contains a mutation in a synMuv A or synMuv B gene. In another 
embodiment, the C. elegans further contains a mutation in a gene that results in
a Vulvaless (Vul) phenotype. In another embodiment, the gene encodes LET-60.

In another aspect, the invention provides a method for identifying a 
nucleic acid target of a synMuv class C polypeptide, the method involves (a) 
providing nucleic acids derived from a nematode cell; (b) crosslinking the 
nucleic acids and their associated proteins to form a nucleic acid-protein 
complex; (c) contacting the nucleic acid-protein complex with an antibody that
binds a polypeptide selected from the group consisting of TRR-1, HAT-1,
EPC-1, and SSL-1; (d) purifying the nucleic acid-protein complex using an
immunological method; and (e) isolating the nucleic acid, where the isolated 
nucleic acid is a nucleic acid that binds a synMuv class C polypeptide. In
one embodiment, the method further involves the following steps: (f) detectably
labeling the nucleic acid of step (e); (g) contacting the detectably labeled
nucleic acid with a microarray containing C. elegans nucleic acid fragments;
and (h) detecting binding of the detectably labeled nucleic acid, where the
binding identifies the nucleic acid as a nucleic acid target of a synMuv class C
polypeptide.

By "binds" is meant a compound or antibody which recognizes and 
binds a polypeptide of the invention, but which does not substantially recognize 
and bind other different molecules in a sample, for example, a biological 
sample, which naturally includes a polypeptide of the invention. 

By "cell" is meant a single-cellular organism, a cell from a multi-cellular
organism, or a cell contained in a multi-cellular organism.

By "derived from" is meant isolated from or having the sequence of a 
naturally-occurring sequence (e.g., a cDNA, genomic DNA, synthetic, or 
combination thereof). 



WO 2004/024084

PCT/US2003/028626



"Differentially expressed" means a difference in the expression level of 
a nucleic acid. This difference may be either an increase or a decrease in 
expression, when compared to control conditions. 
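As a rough illustration of this definition, differential expression between a mutant and a control sample is often summarized as a log2 ratio of signal intensities. The sketch below assumes positive, already-normalized signals; the function names and the two-fold cutoff (a log2 ratio of 1.0) are hypothetical choices for the example and are not values taken from this specification.

```python
from math import log2


def log2_fold_change(mutant_signal, control_signal):
    """Log2 ratio of mutant to control signal (both must be positive)."""
    if mutant_signal <= 0 or control_signal <= 0:
        raise ValueError("signals must be positive")
    return log2(mutant_signal / control_signal)


def classify_expression(mutant_signal, control_signal, threshold=1.0):
    """Label a nucleic acid as increased, decreased, or unchanged.

    threshold is the absolute log2 ratio required to call a difference;
    1.0 (two-fold) is an assumed cutoff used only for illustration.
    """
    lfc = log2_fold_change(mutant_signal, control_signal)
    if lfc >= threshold:
        return "increased"
    if lfc <= -threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    print(classify_expression(400.0, 100.0))  # four-fold higher in the mutant
    print(classify_expression(100.0, 400.0))  # four-fold lower in the mutant
```

Either direction of change counts as differential expression under the definition above; only the cutoff is a matter of experimental design.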

By "epc-1 nucleic acid" is meant a synMuv Class C nucleic acid
substantially identical to Y111B2A.11, which is identified by C. elegans
cosmid name and open reading frame number. 

By "EPC-1 polypeptide" is meant an amino acid sequence substantially 
identical to a polypeptide expressed by an epc-1 nucleic acid that functions
in vulval development and associates with a MYST family histone
acetyltransferase.

By "fragment" is meant a portion of a protein or nucleic acid that is 
substantially identical to a reference protein or nucleic acid (e.g., one of those 
listed in Tables 2 or 3), and retains at least 50% or 75%, more preferably 80%,
90%, or 95%, or even 99% of the biological activity of the reference protein or
nucleic acid using a nematode bioassay as described herein or a standard
biochemical or enzymatic assay. 

By "hybridize" is meant pair to form a double-stranded molecule 
between complementary polynucleotide sequences (e.g., genes listed in Tables 
1-4 and 7), or portions thereof, under various conditions of stringency. (See,
e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399; Kimmel,
A. R. (1987) Methods Enzymol. 152:507.) For example, stringent salt
concentration will ordinarily be less than about 750 mM NaCl and 75 mM
trisodium citrate, preferably less than about 500 mM NaCl and 50 mM
trisodium citrate, and most preferably less than about 250 mM NaCl and 25
mM trisodium citrate. Low stringency hybridization can be obtained in the
absence of organic solvent, e.g., formamide, while high stringency 
hybridization can be obtained in the presence of at least about 35% formamide, 
and most preferably at least about 50% formamide. Stringent temperature 
conditions will ordinarily include temperatures of at least about 30°C, more 

30 preferably of at least about 37°C, and most preferably of at least about 42°C. 
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Varying additional parameters, such as hybridization time, the concentration of 
detergent, e.g., sodium dodecyl sulfate (SDS), and the inclusion or exclusion of 
carrier DNA, are well known to those skilled in the art. Various levels of 
stringency are accomplished by combining these various conditions as needed. 
5 In a preferred embodiment, hybridization will occur at 30°C in 750 mM NaCl, 
75 mM trisodium citrate, and 1% SDS. In a more preferred embodiment, 
hybridization will occur at 37°C in 500 mM NaCl, 50 mM trisodium citrate, 
1% SDS, 35% formamide, and 100 |ig/ml denatured salmon sperm DNA 
(ssDNA). In a most preferred embodiment, hybridization will occur at 42°C in 

10 .250 mMNaCl, 25 mM trisodium citrate, 1% JSDS, 50% formamide, .and 200 
\ig/m\ ssDNA. Useful variations on these conditions will be readily apparent to 
those skilled in the art. 
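
The three hybridization embodiments above can be written down as data and compared against a candidate protocol. The following Python sketch is illustrative only: the class name, field names, and the comparison rule (higher temperature, lower salt, and more formamide count as more stringent) are assumptions made for this example and are not part of the invention as described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationCondition:
    """One set of hybridization conditions (hypothetical container)."""
    temperature_c: float
    nacl_mm: float
    trisodium_citrate_mm: float
    sds_percent: float
    formamide_percent: float = 0.0
    ssdna_ug_per_ml: float = 0.0


# Values taken from the preferred, more preferred, and most preferred
# embodiments in the text, ordered from least to most stringent.
TIERS = {
    "preferred": HybridizationCondition(30, 750, 75, 1.0),
    "more preferred": HybridizationCondition(37, 500, 50, 1.0, 35, 100),
    "most preferred": HybridizationCondition(42, 250, 25, 1.0, 50, 200),
}


def at_least_as_stringent(cond, ref):
    """Assumed rule: hotter, less salt, and more formamide is more stringent."""
    return (
        cond.temperature_c >= ref.temperature_c
        and cond.nacl_mm <= ref.nacl_mm
        and cond.trisodium_citrate_mm <= ref.trisodium_citrate_mm
        and cond.formamide_percent >= ref.formamide_percent
    )


def highest_tier_met(cond):
    """Return the name of the most stringent tier that cond satisfies, or None."""
    met = None
    for name, ref in TIERS.items():  # dict preserves insertion order
        if at_least_as_stringent(cond, ref):
            met = name
    return met
```

For instance, a protocol run at 45°C in 600 mM NaCl and 60 mM trisodium citrate without formamide satisfies only the first tier under this rule, because its salt concentration exceeds the 500 mM limit of the second tier.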

For most applications, washing steps that follow hybridization will also
vary in stringency. Wash stringency conditions can be defined by salt
concentration and by temperature. As above, wash stringency can be increased
by decreasing salt concentration or by increasing temperature. For example,
stringent salt concentration for the wash steps will preferably be less than about
30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about
15 mM NaCl and 1.5 mM trisodium citrate. Stringent temperature conditions
for the wash steps will ordinarily include a temperature of at least about 25°C,
more preferably of at least about 42°C, and most preferably of at least about
68°C. In a preferred embodiment, wash steps will occur at 25°C in 30 mM
NaCl, 3 mM trisodium citrate, and 0.1% SDS. In a more preferred
embodiment, wash steps will occur at 42°C in 15 mM NaCl, 1.5 mM trisodium
citrate, and 0.1% SDS. In a most preferred embodiment, wash steps will occur
at 68°C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Additional
variations on these conditions will be readily apparent to those skilled in the
art. Hybridization techniques are well known to those skilled in the art and are
described, for example, in Benton and Davis (Science 196:180, 1977);
Grunstein and Hogness (Proc. Natl. Acad. Sci. USA 72:3961, 1975); Ausubel
et al. (Current Protocols in Molecular Biology, Wiley Interscience, New York,
2001); Berger and Kimmel (Guide to Molecular Cloning Techniques, 1987,
Academic Press, New York); and Sambrook et al. (Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, New York).

By "hat-1 nucleic acid" is meant a synMuv Class C nucleic acid
substantially identical to VC5.4, which is identified by C. elegans cosmid name
and open reading frame number.

By "HAT-1 polypeptide" is meant an amino acid sequence substantially
identical to a polypeptide expressed by a hat-1 nucleic acid that functions in
vulval development and contains a chromodomain and an acetyltransferase
catalytic domain.

By "lin(n3628) nucleic acid" is meant a nucleic acid substantially
identical to SEQ ID NO:24 that encodes a histone methyltransferase.

By "LIN(n3628) polypeptide" is meant an amino acid sequence having
substantial identity to a polypeptide expressed by a lin(n3628) nucleic acid that
has histone methyltransferase activity and includes a SET domain.

By "lin(n4256) nucleic acid" is meant a synMuv class B nucleic acid
substantially identical to SEQ ID NO:27.

By "LIN(n4256) polypeptide" is meant an amino acid sequence having
substantial identity to a polypeptide expressed by a lin(n4256) nucleic acid and
having histone methyltransferase activity.

By "lin-65 nucleic acid" is meant a synMuv class B nucleic acid
substantially identical to SEQ ID NO:28.

By "LIN-65 polypeptide" is meant an amino acid sequence having
substantial identity to a polypeptide expressed by a lin-65 nucleic acid that is
rich in acidic amino acids.

By "immunological assay" is meant an assay that relies on an
immunological reaction, for example, antibody binding to an antigen.
Examples of immunological assays include ELISAs, Western blots,
immunoprecipitations, and other assays known to the skilled artisan.

By "isolated polynucleotide" is meant a nucleic acid (e.g., a DNA) that
is free of the genes which, in the naturally-occurring genome of the organism
from which the nucleic acid molecule of the invention is derived, flank the
gene. The term therefore includes, for example, a recombinant DNA that is
incorporated into a vector; into an autonomously replicating plasmid or virus;
or into the genomic DNA of a prokaryote or eukaryote; or that exists as a
separate molecule (for example, a cDNA or a genomic or cDNA fragment
produced by PCR or restriction endonuclease digestion) independent of other
sequences. In addition, the term includes an RNA molecule that is transcribed
from a DNA molecule, as well as a recombinant DNA that is part of a hybrid
gene encoding additional polypeptide sequence.

By an "isolated polypeptide" is meant a polypeptide of the invention
that has been separated from components that naturally accompany it.
Typically, the polypeptide is isolated when it is at least 60%, by weight, free
from the proteins and naturally-occurring organic molecules with which it is
naturally associated. Preferably, the preparation is at least 75%, more
preferably at least 90%, and most preferably at least 99%, by weight, a
polypeptide of the invention. An isolated polypeptide of the invention may be
obtained, for example, by extraction from a natural source; by expression of a
recombinant nucleic acid encoding such a polypeptide; or by chemically
synthesizing the protein. Purity can be measured by any appropriate method,
for example, column chromatography, polyacrylamide gel electrophoresis, or
by HPLC analysis.

By "KIAA1732 nucleic acid" is meant a human nucleic acid sequence
having substantial identity to SEQ ID NO:30 and encoding a histone
methyltransferase.

By "KIAA1732 polypeptide" is meant an amino acid sequence
encoded by a nucleic acid substantially identical to SEQ ID NO:30, having
histone methyltransferase activity, and including a SET domain.

By "mep-1 nucleic acid" is meant a synMuv Class B nucleic acid
substantially identical to M04B2.1, which is identified by C. elegans cosmid
name and open reading frame number.

By "MEP-1 polypeptide" is meant an amino acid sequence substantially
identical to a polypeptide expressed by a mep-1 nucleic acid that functions in
vulval development and contains multiple Zn finger motifs.

By "multivulva" is meant having one vulva and one additional vulva-like
structure.

By "nucleic acid" is meant an oligomer or polymer of ribonucleic acid
or deoxyribonucleic acid, or analog thereof. This term includes oligomers
consisting of naturally occurring bases, sugars, and intersugar (backbone)
linkages as well as oligomers having non-naturally occurring portions which
function similarly. Such modified or substituted oligonucleotides are often
preferred over native forms because of properties such as, for example,
enhanced cellular uptake and increased stability in the presence of nucleases.

Specific examples of some preferred nucleic acids envisioned for this
invention may contain phosphorothioates, phosphotriesters, methyl
phosphonates, short chain alkyl or cycloalkyl intersugar linkages or short chain
heteroatomic or heterocyclic intersugar linkages. Most preferred are those with
CH2-NH-O-CH2, CH2-N(CH3)-O-CH2, CH2-O-N(CH3)-CH2,
CH2-N(CH3)-N(CH3)-CH2 and O-N(CH3)-CH2-CH2 backbones
(where phosphodiester is O-P-O-CH2). Also preferred are
oligonucleotides having morpholino backbone structures (Summerton, J.E. and
Weller, D.D., U.S. Pat. No. 5,034,506). In other preferred embodiments, such
as the protein-nucleic acid (PNA) backbone, the phosphodiester backbone of
the oligonucleotide may be replaced with a polyamide backbone, the bases
being bound directly or indirectly to the aza nitrogen atoms of the polyamide
backbone (P.E. Nielsen et al. Science 199: 254, 1997). Other preferred
oligonucleotides may contain alkyl and halogen-substituted sugar moieties
comprising one of the following at the 2' position: OH, SH, SCH3, F, OCN,
O(CH2)nNH2 or O(CH2)nCH3, where n is from 1 to about 10; C1 to C10 lower
alkyl, substituted lower alkyl, alkaryl or aralkyl; Cl; Br; CN; CF3; OCF3; O-, S-,
or N-alkyl; O-, S-, or N-alkenyl; SOCH3; SO2CH3; ONO2; NO2; N3; NH2;
heterocycloalkyl; heterocycloalkaryl; aminoalkylamino; polyalkylamino;
substituted silyl; an RNA cleaving group; a conjugate; a reporter group; an
intercalator; a group for improving the pharmacokinetic properties of an
oligonucleotide; or a group for improving the pharmacodynamic properties of
an oligonucleotide and other substituents having similar properties.
Oligonucleotides may also have sugar mimetics such as cyclobutyls in place of
the pentofuranosyl group.

Other preferred embodiments may include at least one modified base
form. Some specific examples of such modified bases include
2-(amino)adenine, 2-(methylamino)adenine, 2-(imidazolylalkyl)adenine,
2-(aminoalkylamino)adenine, or other heterosubstituted alkyladenines.

By "ortholog" is meant a polypeptide or nucleic acid molecule of an
organism that is highly related to a reference protein, or nucleic acid sequence,
from another organism. An ortholog is functionally related to the reference
protein or nucleic acid sequence. In other words, the ortholog and its reference
molecule would be expected to fulfill similar, if not equivalent, functional roles
in their respective organisms. It is not required that an ortholog, when aligned
with a reference sequence, have a particular degree of amino acid sequence
identity to the reference sequence. A protein ortholog might share significant
amino acid sequence identity over the entire length of the protein, for example,
or, alternatively, might share significant amino acid sequence identity over only
a single functionally important domain of the protein. Such functionally
important domains may be defined by genetic mutations or by
structure-function assays. Orthologs may be identified using methods provided
herein. The functional role of an ortholog may be assayed using methods well
known to the skilled artisan, and described herein. For example, function might
be assayed in vivo or in vitro using a biochemical, immunological, or enzymatic
assay; by transformation rescue; or in a nematode bioassay for the effect of gene
inactivation on nematode phenotype (e.g., fertility), as described herein.
Alternatively, bioassays may be carried out in tissue culture; function may also
be assayed by gene inactivation (e.g., by RNAi, siRNA, or gene knockout), or
gene over-expression, as well as by other methods.

By "polypeptide" is meant any chain of amino acids, or analogs thereof, 
regardless of length or post-translational modification (for example, 
glycosylation or phosphorylation). 

By "positioned for expression" is meant that the polynucleotide of the
invention (e.g., a DNA molecule) is positioned adjacent to a DNA sequence
that directs transcription and translation of the sequence (i.e., facilitates the
production of, for example, a recombinant polypeptide of the invention, or an
RNA molecule).

By "purified antibody" is meant an antibody that is at least 60%, by
weight, free from proteins and naturally-occurring organic molecules with
which it is naturally associated. Preferably, the preparation is at least 75%,
more preferably 90%, and most preferably at least 99%, by weight, antibody.
A purified antibody of the invention may be obtained, for example, by affinity
chromatography using a recombinantly-produced polypeptide of the invention
and standard techniques.

By "specifically binds" is meant a compound or antibody that 
recognizes and binds a polypeptide of the invention, but which does not 
substantially recognize and bind other molecules in a sample, for example, a 
biological sample, which naturally includes a polypeptide of the invention. 

By "ssl-1 nucleic acid" is meant a nucleic acid substantially identical to
SEQ ID NO:21, which is identified by C. elegans cosmid name and open
reading frame number.

By "SSL-1 polypeptide" is meant an amino acid sequence substantially
identical to a polypeptide expressed by a ssl-1 nucleic acid that functions in
embryonic development and has homology to p400, a SWI2/SNF2 family
member having ATPase activity.

By "synthetic multivulva (synMuv) gene" is meant a gene that, when
mutated, interacts synergistically with a second synMuv gene to cause a
synthetic multivulval phenotype. For example, trr-1 and mep-1 are synMuv
genes because worms containing a mutation in trr-1 or mep-1, and also having
a mutation in lin-15A (e.g., lin-15A(n767)), display a synthetic multivulval
phenotype.

By "trr-1 nucleic acid" is meant a nucleic acid substantially identical to
SEQ ID NO:12, which is identified by C. elegans cosmid name and open
reading frame number. Nucleic acid and polypeptide sequence information is
available at WormBase (www.wormbase.org), a central repository of data on C.
elegans.

By "TRR-1 polypeptide" is meant an amino acid sequence substantially
identical to a polypeptide expressed by a trr-1 nucleic acid that functions in
transcriptional regulation and vulval development.

"Therapeutic compound" means a substance that has the potential of 
affecting the function of an organism. Such a compound may be, for example, 
a naturally occurring, semi-synthetic, or synthetic agent. For example, the test 
compound may be a drug that targets a specific function of an organism. A test
compound may also be an antibiotic or a nutrient. A therapeutic compound 
may decrease, suppress, attenuate, diminish, arrest, or stabilize the 
development or progression of disease, disorder, or infection in a eukaryotic 
host organism. 

The invention provides a number of targets that are useful for the
development of highly specific drugs to treat neoplasia or a disorder
characterized by the misregulation of the cell cycle (e.g., a hyperproliferative
disorder). In addition, the methods of the invention provide a facile means to
identify therapies that are safe for use in eukaryotic host organisms (i.e.,
compounds that do not adversely affect the normal development, physiology,
or fertility of the organism). In addition, the methods of the invention provide
a route for analyzing virtually any number of compounds for effects on cell
proliferation and cell cycle regulation inexpensively and with high-volume
throughput in a living animal.

Other features and advantages of the invention will be apparent from the
detailed description, and from the claims.

The invention provides methods and compositions useful in treating a
neoplasia and in identifying chemotherapeutic agents.

Brief Description of the Drawings 

Figure 1A is a schematic diagram showing the location of mep-1 on the LGIV
physical map in between sem-3 and dpy-20. The mep-1 rescuing cosmid
M04B2 is shown in bold.

Figure 1B shows the predicted MEP-1 protein (SEQ ID NO:1). Zinc
finger motifs are shaded, and the positions of mep-1 mutations are indicated by
arrowheads.

Figure 2 shows the genomic sequence of mep-1 (SEQ ID NO:2). The
start and stop codons are indicated by highlighting.

Figure 3 shows the nucleic acid sequence of the mep-1 open reading
frame (SEQ ID NO:3).

Figure 4 shows the deduced amino acid sequence of MEP-1.

Figures 5A and 5B are bar graphs showing that trr-1 single mutants are
defective in P(8).p fate specification. Induction of individual P(3-8).p cells was
scored in wild-type animals (Figure 5A) and trr-1(n3712) mutants (Figure 5B).
Certain cells in trr-1 mutants adopted hybrid fates in which one of two Pn.p
daughters divided like daughters of induced Pn.p cells and the other daughter
remained undivided as in uninduced Pn.p cells. Ectopic induction in single
mutant animals containing each of the other five trr-1 mutations was similarly
restricted to P8.p.

Figure 6 is a bar graph showing that trr-1 and class B synMuv
mutations are synthetically defective in P8.p cell-fate specification. P8.p
induction was scored. We recognized trr-1 homozygous mutants as non-Gfp
progeny of trr-1/mIn1[dpy-10(e128) mIs14] heterozygous parents.
lin-15B(n744), lin-35(n745), lin-36(n766), and lin-37(n758) are the strongest
mutations of their corresponding genes. Strains homozygous for these
mutations are viable. trr-1; synMuvB double mutant strains with these
mutations were derived from parents that were homozygous for the synMuvB
mutation and hence lacked maternal and zygotic function of the class B
synMuv gene in question. The dpl-1(n3316) null mutation causes sterility. We
combined dpl-1(RNAi) with the dpl-1(n3316) mutation to generate mutants that
lacked both maternal and zygotic dpl-1 activity and recognized these mutants
as non-Gfp progeny of dpl-1(n3316) trr-1/mIn1[dpy-10(e128) mIs14]
heterozygous parents that were injected with dpl-1 dsRNA.

Figure 7A shows the trr-1 gene structure as derived from cDNA and
genomic sequences. Shaded boxes indicate coding sequence and open boxes
indicate 5' and 3' untranslated regions. Predicted translation initiation and
termination codons and the poly(A) tail are indicated. Positions of alternative
splicing are indicated by asterisks. In all cases, the use of alternative splice
acceptors creates small differences in the trr-1 coding sequence: alternative
splicings of the fourth (ag/TTTCAGAC (SEQ ID NO:4) versus agtttcag/AC
(SEQ ID NO:5)), fifth (ag/AATCTTCAGTC (SEQ ID NO:6) versus
agaatcttcag/CC (SEQ ID NO:7)), eleventh (ag/AACTTTAAGAT (SEQ ID
NO:8) versus agaactttaag/AT (SEQ ID NO:9)), and twelfth introns
(ag/TTGCAGAA (SEQ ID NO:10) versus agttgcag/AA (SEQ ID NO:11))
differ by either six or nine nucleotides.
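
The six- or nine-nucleotide differences can be checked directly from the junction notation used in the Figure 7A legend. The Python sketch below is illustrative only; it assumes that "/" marks the splice junction, that lowercase letters are intronic and uppercase letters exonic, and it uses a hypothetical helper name, acceptor_shift.

```python
# Alternative acceptor pairs as written in the Figure 7A legend.
ALTERNATIVE_ACCEPTORS = {
    "fourth": ("ag/TTTCAGAC", "agtttcag/AC"),
    "fifth": ("ag/AATCTTCAGTC", "agaatcttcag/CC"),
    "eleventh": ("ag/AACTTTAAGAT", "agaactttaag/AT"),
    "twelfth": ("ag/TTGCAGAA", "agttgcag/AA"),
}


def acceptor_shift(site_a: str, site_b: str) -> int:
    """Number of nucleotides between the two junction positions ('/')."""
    return abs(site_a.index("/") - site_b.index("/"))


shifts = {
    intron: acceptor_shift(upstream, downstream)
    for intron, (upstream, downstream) in ALTERNATIVE_ACCEPTORS.items()
}

# A shift that is a multiple of three leaves the downstream reading frame intact.
frame_preserving = {intron: n % 3 == 0 for intron, n in shifts.items()}
```

Under these assumptions every shift is a multiple of three, which is consistent with the statement that the alternative acceptors create only small differences in the coding sequence.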

Figure 7B is a schematic diagram of the TRR-1 protein. The positions
of substitutions caused by TRR-1 mutations are indicated above. TRR-1 is
similar to mammalian TRRAP and yeast Tra1p throughout the lengths of the
proteins. Domains of similarity (e.g., FAT and ATM/PI-3 kinase-like domains)
that these three proteins share are indicated.

Figure 8 shows the genomic nucleic acid sequence of trr-1 (SEQ ID
NO:12). The start and stop codons are indicated by highlighting.

Figure 9 shows the nucleic acid sequence of the trr-1 open reading
frame (SEQ ID NO:13).

Figure 10 shows the deduced amino acid sequence of TRR-1 (SEQ ID
NO:14).

Figure 11A is a schematic diagram showing the hat-1 gene structure as
derived from cDNA and genomic sequences. Shaded boxes indicate coding
sequence and open boxes indicate 5' and 3' untranslated regions. Predicted
translation initiation and termination codons and the poly(A) tail are shown.

Figure 11B is a schematic diagram of the HAT-1 protein. HAT-1 is
similar to MYST family acetyltransferases, all of which contain a MOZ/SAS
acetyltransferase domain and some of which contain a chromodomain.
Nematodes expressing the hat-1(n4075) deletion are expected to produce only
the first 35 amino acids of the wild-type HAT-1 protein and additional
frameshifted amino acids prior to truncation.

Figure 11C is a bar graph showing that hat-1 single mutants were
defective in P(8).p fate specification. Induction of individual P(3-8).p cells was
scored in wild-type animals (left) and hat-1(n4075) mutants (right). hat-1
homozygous mutants were recognized as non-Unc progeny of +/nT1 n754;
hat-1(n4075)/nT1 n754 heterozygous parents.

Figure 11D is a bar graph showing that hat-1 is synthetically defective
in P8.p cell-fate specification with the class B synMuv mutation lin-15B(n744).
P8.p induction was scored as described below. hat-1 homozygous mutants
were recognized as in (C).

Figure 12 shows the genomic nucleic acid sequence of hat-1 (SEQ ID
NO:15). The start and stop codons are indicated by highlighting.

Figure 13 shows the nucleic acid sequence of the hat-1 open reading
frame (SEQ ID NO:16).

Figure 14 shows the deduced amino acid sequence of HAT-1 (SEQ ID
NO:17).

Figure 15A is a schematic diagram showing epc-1 and ssl-1 gene
structures and deletion mutations. The gene structure of epc-1 was derived by
comparing cDNA and genomic sequences.

Figure 15B is a schematic showing the ssl-1 gene structure and deletion
mutation. The gene structure of ssl-1 is partially derived from comparison of
cDNA and genomic sequences (SL1 splice leader, 5' untranslated region, exons
1-12 and the beginning of exon 13) and partially predicted solely from genomic
sequence (the end of exon 13). As we do not have cDNA clones representing
the 3' end of ssl-1, we are unable to reliably assign a 3' untranslated region and
poly(A) tail. Filled boxes indicate coding sequence and open boxes indicate 5'
and 3' untranslated regions. SL1 splice leaders, predicted translation start and
stop codons and poly(A) tail are shown. The regions of genomic sequence
removed by the epc-1(n4076) and ssl-1(n4077) deletions are indicated.

Figure 16 shows the genomic nucleic acid sequence of epc-1 (SEQ ID
NO:18).

Figure 17 shows the nucleic acid sequence of the epc-1 open reading
frame (SEQ ID NO:19).

Figure 18 shows the deduced amino acid sequence of EPC-1 (SEQ ID
NO:20).

Figure 19 shows the genomic nucleic acid sequence of ssl-1 (SEQ ID
NO:21) and the deduced amino acid sequence.

Figure 20A shows the exon boundaries of the ssl-1 genomic nucleic acid
sequence.

Figure 20B shows the cDNA nucleic acid sequence of ssl-1 (SEQ ID
NO:22).

Figure 21 shows the amino acid sequence of SSL-1 (SEQ ID NO:23).

Figures 22A and 22B are schematic diagrams showing two models of
TRR-1/HAT-1/EPC-1 function with respect to class B synMuv proteins.

Figure 22A is a schematic diagram showing that a TRR-1/HAT-1/EPC-1
complex and the class B synMuv proteins act on different targets and
differentially regulate transcription. In this model a putative
TRR-1/HAT-1/EPC-1 complex acts on targets that are different from those of a
putative class B synMuv protein complex. A TRR-1/HAT-1/EPC-1 complex may
promote transcription of genes that negatively regulate vulval development,
whereas class B synMuv proteins may repress transcription of genes that
promote vulval development.

Figure 22B is a schematic diagram showing a second model. In this
second model, a TRR-1/HAT-1/EPC-1 complex acts on the same targets as do
the class B synMuv proteins. Together these two putative protein complexes
may specify an acetylation pattern on histones that is required for efficient
silencing of genes that promote vulval development. A TRR-1/HAT-1/EPC-1
complex may act through DPL-1 and EFL-1, although genetic interactions
suggest that not all TRR-1/HAT-1/EPC-1 complex activity goes through DPL-1
and EFL-1.

Figure 23 shows the genomic sequence of lin(n3628) including 1 kb of
upstream and downstream genomic sequences (SEQ ID NO:24). The exon
boundaries are also defined.

Figure 24 shows the amino acid sequence of LIN(n3628) (SEQ ID
NO:25).

Figure 25 shows the genomic sequence of lin(n4256) (SEQ ID NO:26).
The exon boundaries are also defined.

Figure 26 shows the amino acid sequence of LIN(n4256) (SEQ ID
NO:27).

Figure 27 shows the genomic sequence of lin-65 (SEQ ID NO:28). The
exon boundaries are also defined.

Figure 28 shows the amino acid sequence of LIN-65 (SEQ ID NO:29).
The exon boundaries are also defined.

Figure 29 shows the mRNA sequence that encodes the LIN(n3628)
human ortholog, KIAA1732.

Figure 30 shows the amino acid sequence of KIAA1732 (SEQ ID
NO:35).

Figure 31 defines the domains of LIN(n3628), including the SET
catalytic domain.

Figure 33 defines the domains of KIAA1732, including the SET
catalytic domain.

Description of the Invention 

As reported in more detail below, we have identified new components of
the Rb pathway that function in chromatin remodeling and antagonize Ras
signaling, and methods for using such components for the identification of
chemotherapeutics and the identification of new clinical targets for the
treatment of neoplasia.

Example I
Isolation of new synMuv mutants

A variety of genetic studies revealed that sterility is often associated
with a severe reduction of class B synMuv gene function. For example, in a
genetic screen for alleles that did not complement the synMuv phenotype of
lin-9(n112), Ferguson et al. (Genetics 123:109-21, 1989) recovered the alleles
lin-9(n942) and lin-9(n943), which caused sterility when homozygous. In
another example, we performed gene dosage studies and observed that, in
comparison to the wild type, lin-52(n771)/Df and dpl-1(n2994)/Df
heterozygotes had markedly reduced brood sizes. In addition, deletion
mutations of synMuv genes that showed recessive sterility were recovered by
reverse genetic approaches (e.g., alleles of lin-53 (Lu 1999), lin-54, and dpl-1
(Ceol et al., Mol Cell 7:461-73, 2001)).

Previous genetic screens for synMuv mutants (Ferguson et al., Genetics
123:109-21, 1989) were performed before a link between loss of synMuv gene
function and sterility was well established. These screens required that isolates
be fertile and viable in order to recover mutant alleles. In addition to failing to
recover recessive sterile mutations of the genes described above, these screens
failed to recover mutations of the class B synMuv genes efl-1 and let-418, both
of which can mutate to a sterile phenotype (Von Zelewsky et al., Development
127:5277-84, 2000; Ceol et al., Mol Cell 7:461-73, 2001). Given this failure,
we undertook a genetic screen to identify additional synMuv genes that would
allow the recovery of homozygous sterile mutations through phenotypically
wild-type heterozygous siblings.

To screen for new synMuv mutants, we examined the F2 progeny of
individually plated F1 animals after EMS mutagenesis of lin-15A(n767)
mutants. This screen represented 6760 haploid genomes examined for
mutations that either alone or in combination with lin-15A(n767) showed a
recessive Muv phenotype. Using this strategy we identified 95 Muv mutations,
24 of which were maintained as heterozygotes due to recessive sterility that
cosegregated with the Muv phenotype. Three mutations caused a Muv
phenotype in the absence of lin-15A(n767) and were found to affect lin-1 and
lin-31, both of which function downstream of let-60 Ras in vulval induction
(Ferguson et al., Nature 326:259-67, 1987). These mutations, lin-1(n3443),
lin-1(n3522), and lin-31(n3440), were not characterized further. Additionally,
we recovered 29 mutations that, together with lin-15A(n767), caused a weakly
penetrant (< 30%) Muv phenotype. The remaining 63 mutations were assigned
to 21 complementation groups, which include the previously known genes
ark-1, dpl-1, efl-1, gap-1, let-418, lin-9, lin-13, lin-15B, lin-35, lin-36, lin-52,
lin-53, lin-61, and sli-1, and the new genes lin(n3441), lin(n3542), lin(n3628),
lin(n3681), lin(n3707), mep-1, and trr-1.
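
The bookkeeping of the screen described above can be tallied in a few lines. This Python sketch is a consistency check only; the variable names are hypothetical, and the counts and gene names are copied from the preceding paragraph.

```python
# Counts reported for the screen of lin-15A(n767) mutants.
TOTAL_MUV_MUTATIONS = 95

mutation_classes = {
    "Muv in the absence of lin-15A(n767)": 3,
    "weakly penetrant (< 30%) with lin-15A(n767)": 29,
    "assigned to complementation groups": 63,
}

previously_known_genes = [
    "ark-1", "dpl-1", "efl-1", "gap-1", "let-418", "lin-9", "lin-13",
    "lin-15B", "lin-35", "lin-36", "lin-52", "lin-53", "lin-61", "sli-1",
]
new_genes = [
    "lin(n3441)", "lin(n3542)", "lin(n3628)", "lin(n3681)", "lin(n3707)",
    "mep-1", "trr-1",
]

# The three classes should account for every isolate, and the known plus
# new genes should account for every complementation group.
classified_total = sum(mutation_classes.values())
complementation_groups = len(previously_known_genes) + len(new_genes)
```

The three classes sum to the 95 isolates, and the 14 previously known genes plus the 7 new genes give the 21 complementation groups stated in the text.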

Phenotypes of new mutants 

We characterized the penetrance of the Muv phenotype for each strain at 
15 15°C and 20°C. The results of this study are described in Table 1. 
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Table 1 Penetrance of Muv phenotype (n) 



Genotype 


15° C 


20° C 


Additional phenotypes 


ark-l(n3524) Un-15A(n767) 


0(251) 


80 (171) 




ark-l(n3701); lin- 








15A(n767) 


12 (190) 


95 (160) 




dpl-l(n3643); lin- 








15A(n767) 


99 (154) 


100 (252) 




efl-l(n3639); Un-15A(n767) 


93 (74) 


100(78) 


Ste 


gap-l(n3535) lin- 








15A(n767) 


1.4 (143) 


50 (236) 




let-418(n3536); lin- 








15A(n767) 


0 (201) 


55 (183) 


hsSte 


let-418(n3626); lin- 








15A(n767) 


1.6(62) 


97 (76) 


Ste 


let-418(n3629); lin- 








15A(n767) 


0(52) 


86 (58) 


Ste 


let-418(n3634); lin- 








15A(n767) 


0(87) 


92 (48) 


Ste 


let-418(n3635); lin- 








15A(n767) 


0(76) 


71 (70) 


Ste 


let-418(n3636); lin- 








J5A(n767) 


0(77) 


92 (78) 


Ste 


let-418(n3719); lin-15A(n767)    0 (101)    100 (60)    Ste
lin-9(n3631); lin-15A(n767)    100 (42)    100 (72)    Ste
lin-9(n3675); lin-15A(n767)    43 (166)    100 (105)
lin-9(n3767); lin-15A(n767)    100 (67)    100 (56)    Ste
lin-13(n3642); lin-15A(n767)    3.3 (60)    100 (63)    Ste
lin-13(n3673); lin-15A(n767)    61 (145)    97 (129)
lin-13(n3674); lin-15A(n767)    78 (131)    100 (191)    hs Ste
lin-13(n3726); lin-15A(n767)    31 (225)    99 (149)    hs Ste

Genotype    15°C    20°C    Additional phenotypes

lin-15B(n3436) lin-15A(n767)    100 (193)    100 (212)
lin-15B(n3676) lin-15A(n767)    18 (167)    72 (130)
lin-15B(n3677) lin-15A(n767)    99 (111)    100 (122)
lin-15B(n3711) lin-15A(n767)    100 (186)    100 (156)
lin-15B(n3760) lin-15A(n767)    32 (171)    100 (150)
lin-15B(n3762) lin-15A(n767)    63 (113)    97 (116)
lin-15B(n3764) lin-15A(n767)    96 (232)    100 (199)
lin-15B(n3766) lin-15A(n767)    55 (132)    100 (173)
lin-15B(n3768) lin-15A(n767)    80 (159)    100 (302)
lin-15B(n3772) lin-15A(n767)    100 (220)    100 (191)
lin-35(n3438); lin-15A(n767)    100 (153)    100 (126)    partial Ste at 20°C, Rup
lin-35(n3763); lin-15A(n767)    100 (108)    100 (160)    partial Ste at 20°C, Rup
lin-36(n3671); lin-15A(n767)    65 (191)    100 (151)
lin-36(n3672); lin-15A(n767)    98 (198)    100 (178)
lin-36(n3765); lin-15A(n767)    0 (184)    37 (202)
lin-52(n3718); lin-15A(n767)    100 (41)    100 (82)    Ste
lin-53(n3448); lin-15A(n767)    67 (130)    100 (211)    partial Ste at 20°C

lin-53(n3521); lin-15A(n767)    100 (34)    100 (125)
lin-53(n3622); lin-15A(n767)    85 (61)    100 (66)
lin-53(n3623); lin-15A(n767)    24 (55)    100 (51)
lin-61(n3442); lin-15A(n767)    22 (130)    100 (152)
lin-61(n3446); lin-15A(n767)    36 (124)    99 (191)
lin-61(n3447); lin-15A(n767)    11 (121)    87 (207)
lin-61(n3624); lin-15A(n767)    0 (152)    89 (231)
lin-61(n3736); lin-15A(n767)    0 (193)    100 (201)
n3441; lin-15A(n767)    80 (165)    99 (195)
n3541; lin-15A(n767)    79 (242)    98 (137)
n3543; lin-15A(n767)    85 (177)    100 (121)
n3628; lin-15A(n767)    2.9 (103)    84 (188)
n3681; lin-15A(n767)    0 (214)    72 (192)
n3542 lin-15A(n767)    0 (127)    35 (218)
n3707 lin-15A(n767)    3.8 (80)    77 (26)
mep-1(n3680); lin-15A(n767)    4.9 (122)    97 (105)
mep-1(n3702); lin-15A(n767)    30 (61)    100 (141)
mep-1(n3703); lin-15A(n767)    25 (72)    100 (107)
sli-1(n3538) lin-15A(n767)    4.3 (138)    90 (173)
sli-1(n3544) lin-15A(n767)    4.6 (153)    80 (265)
sli-1(n3683) lin-15A(n767)    5.0 (80)    88 (148)
trr-1(n3630); lin-15A(n767)    3.1 (131)    85 (212)
trr-1(n3637); lin-15A(n767)    1.1 (92)    80 (200)



Additional phenotypes 

partial Ste at 20°C 

Ste 

Ste 



hsSte 

Ste 

Ste 

cs embryonic lethality 
cs embryonic lethality 
Ste, Gro 
Ste, Gro

trr-1(n3704); lin-15A(n767)    3.1 (96)    79 (244)    Ste, Gro
trr-1(n3708); lin-15A(n767)    2.0 (151)    84 (228)    Ste, Gro
trr-1(n3709); lin-15A(n767)    1.0 (97)    77 (154)    Ste, Gro
trr-1(n3712); lin-15A(n767)    5.8 (121)    77 (192)    Ste, Gro



Ste: sterile; Gro: growth rate abnormal; Rup: rupture at the vulva; cs: cold sensitive; hs: heat sensitive. 

The penetrance of the Muv phenotype was determined after growing
synMuv mutant strains at the indicated temperature for two or more
generations. For most strains in which a fully penetrant sterile phenotype was
associated with the Muv phenotype, we scored the penetrance of the Muv
phenotype by examining sterile progeny of heterozygous mutant parents. For
trr-1 mutant strains, we scored the penetrance of the Muv phenotype by
examining non-Gfp progeny of trr-1 / mIn1[dpy-10(e128) mIs14];
lin-15A(n767) heterozygous parents. All strains were backcrossed to
lin-15A(n767) twice prior to phenotypic characterization. In addition to the
phenotypes described above, many of the strains exhibited heat-sensitive
inviability due to frequent rupture, sterility, and/or general sickness.

The penetrance at 25°C is not shown because all strains had a highly
penetrant (>90%) Muv phenotype at this temperature. Since a heat-sensitive
Muv phenotype is characteristic of most synMuv strains, including those with
null mutations in synMuv genes (Ferguson et al., Genetics 123: 109-21, 1989),
it is likely that many synMuv mutations are not particularly temperature
sensitive, but rather that the synMuv genes regulate a temperature-sensitive
process.
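
The penetrance entries in the table above can be read programmatically. The following is a minimal sketch, assuming each cell has the form "percent (n)", read here as percent Muv followed by the number of animals scored. That reading and the helper names parse_penetrance, muv_count and temperature_shift are illustrative assumptions and were not part of the analysis described here.

```python
import re


def parse_penetrance(entry):
    """Parse a table cell such as '43 (166)' into (percent, animals scored).

    Assumption: the first number is percent Muv and the parenthesized
    number is the count of animals examined.
    """
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*\(\s*(\d+)\s*\)\s*", entry)
    if match is None:
        raise ValueError(f"unrecognized penetrance entry: {entry!r}")
    return float(match.group(1)), int(match.group(2))


def muv_count(entry):
    """Approximate number of Muv animals implied by a 'percent (n)' cell."""
    percent, scored = parse_penetrance(entry)
    return round(percent * scored / 100)


def temperature_shift(entry_15c, entry_20c):
    """Difference in percent Muv between 20°C and 15°C for one strain."""
    return parse_penetrance(entry_20c)[0] - parse_penetrance(entry_15c)[0]


# lin-36(n3765); lin-15A(n767): 0 (184) at 15°C and 37 (202) at 20°C
print(temperature_shift("0(184)", "37(202)"))
# lin-9(n3675); lin-15A(n767) at 15°C
print(muv_count("43 (166)"))
```

A large positive shift between 15°C and 20°C is what the heat-sensitive Muv phenotype looks like in these data. The sketch only reports the difference and makes no claim about its statistical significance.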

A subset of our synMuv strains also exhibited a sterile phenotype. In
these strains, the sterile phenotype cosegregated with the Muv phenotype
during backcrosses and two- and three-factor mapping experiments. For those
mutations tested, we found that our new mutations did not complement the
sterile phenotypes caused by previously isolated, allelic synMuv mutations.
These observations suggest that the sterile and Muv phenotypes of these strains
were caused by the same mutation.

We observed an unusual aspect to the sterility of one of our strains. We
examined the mep-1(n3680); lin-15A(n767) strain and found that its sterile
phenotype showed maternal-effect rescue. When derived from heterozygous
parents, the sterility of the mep-1(n3680); lin-15A(n767) animals was 3.2%
penetrant (n=62), but was 55% penetrant (n=69) when these animals were
derived from homozygous parents. Mutations that affect the Mes
(maternal-effect sterility) genes also show maternal-effect rescue of sterility
(Capowski et al., Genetics 129: 1061-72, 1991). Some Mes genes encode
homologs of Drosophila Polycomb group proteins and are proposed to function
in X chromosome transcriptional silencing in the germline (Holdeman et al.,
Development 125: 2457-67, 1998; Korf et al., Development 125: 2469-78,
1998; Fong et al., Science 296: 2235-8, 2002). A functional relationship
between the synMuv and Mes genes has not been previously reported.
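
The size of the maternal-effect difference can be illustrated with Fisher's exact test on a 2x2 table. The sketch below is not part of the reported analysis. It assumes the sterile counts can be recovered by rounding the stated penetrances (about 2 of 62 and 38 of 69 animals), and the function name fisher_exact_two_sided is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )


# Assumed counts: rows are heterozygous and homozygous parents,
# columns are sterile and fertile progeny.
sterile_het, scored_het = 2, 62     # about 3.2% of 62
sterile_hom, scored_hom = 38, 69    # about 55% of 69
p_value = fisher_exact_two_sided(
    sterile_het, scored_het - sterile_het,
    sterile_hom, scored_hom - sterile_hom,
)
print(p_value)
```

Under these assumed counts the p-value is far below conventional thresholds, which is consistent with describing the sterility as maternally rescued.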

New synMuv genes

Using two-factor crosses and sex chromosome transmission tests, we
mapped the new mutations to linkage groups (Table 2).

Table 2 Chromosomal linkages of new synMuv mutations
A. Autosomal mutations

New mutation    Mutation used for selection of homozygous F2 hermaphrodites    Genotype of selected F2 hermaphrodites with respect to the linked, unselected mutation


ark-l(n3524) 


apy-2v{eilol) IV 


9/1 Q nrk-Wn3524)/+ 


ark-l(n3701) 


a? K'J(riJ/UJJ 


1 /1 4 Anv-lftM 282)1+ IV 


dpl-l(n3643) 


dpi- 1(113043) 


0/90 rnl-6(el87)/+ II 
yji A*\j rut -u\ci u / jt 1 


efl-l(n3639) 


r ol-4 (sco) v 


4/90 pfl-1fn3639)/+ 


let-418(n3536) 


let-4Jo(n3j30) 


4/91 rnl-4fcc8)/+ V 


let-418(n3626) 


rol-4(sco) v 


0/10 1t>t-41Rfn362fi)/+ 


let-418(n3629) 


_ 1 A/ n/% 0\ If 

rol-4(sco) V 




let-418(n3634) 


ro 1-4 (sco) V 


Lily i&i-m 0[fijuj*ty/' 


let-418(n3635) 


rol-4(sc8) V 


^/90 7i?/ AIR/nlfil ^)/4- 


let-418(n3636) 


roU4(sc8) V 


^/90 7i?/ z/7>?/m : ?^ : ?/^)/+ 
Dl JA) l€l-HiO[nD\JJV//* 


let-418(n3719) 


rol-4(sc8) V 




Un-9(n363J) 


unc-32(el89) III 


0790 7»» 0/w 3/> R 7 )/4- 
U/ZU lln-y{rlJOJJJ/^ 


Hn-9(n3675) 


hn-9(n3675) 


n/99 t inn llfaJXQ)/!- TIT 


hn-9(n3767) 


Lin-9(n37o7) 


0/16 wcrP9//+ 777 


Un-13(n3642) 


unc-32(elo9) 111 


1 /90 /in- / ^^49)/+ 
1/ZU llrl-l J(/f JU¥Z//~ 


hn-13(n3673) 


f."« 7 3/Vt3/C73) 

lin-13(n3o/3) 


0/95 unr-12fel89)/+ III 


hn-13(n3674) 


lin-13(n3o/4) 




hn-13(n3726) 


Un-13(n3 /Jo) 


1/96 unr-llfplRQ)/* III 


hn-35(n3438) 


lin-35(n343o) 


0/^0 /7nv-^/>n'7)/+ 7 
u/ju upy-j \ wi j/* * 


lin-35(n3763) 


7 • 3CA»3 7>C3l 

lin-35(n3/03) 


u/zz (Ajjy-j\G\jij/ * i 


hn-36(n3671) 


7'.- 3^/.*3>C7l\ 

hn-3o(n3o/l) 


1/9^ unr-1')fo1RQ)/+ JIT 
1/Z3 unc-Jz\Gioyj/iiii 




1 m ~ 3>f/«3>C7'>\ 

Ilil-3o(n3o72) 


u/io ii7ic-j4\oi oyjt * m 


Un-36(n3765) 


lin-3o(n3/o3) 


O/0 1'){p1RQ\/<\' TTT 


Un-52(n3718) 


hn-52(n3718) 


1 *m^D9 7/4- 777* 

1/ 1 o mgrZl/'T ui 


lin-53(n3448)    lin-53(n3448)    1/22 dpy-5(e61)/+ I
lin-53(n3521)    dpy-5(e61) I    0/20 lin-53(n3521)/+
lin-53(n3622)    dpy-5(e61) I    5/30 lin-53(n3622)/+
lin-53(n3623)    lin-53(n3623)    4/16 hP4/+ I
lin-61(n3442)    lin-61(n3442)    0/20 dpy-5(e61)/+ I
lin-61(n3446)    lin-61(n3446)    1/23 dpy-5/+ I
lin-61(n3447)    lin-61(n3447)    0/13 dpy-5(e61)/+ I
lin-61(n3624)    lin-61(n3624)    0/15 dpy-5(e61)/+ I
lin-61(n3736)    dpy-5(e61) I    1/19 lin-61(n3736)/+
lin(n3441)    lin(n3441)    5/20 dpy-5(e61)/+ I
lin(n3541)    lin(n3541)    9/31 dpy-5(e61)/+ I
lin(n3543)    lin(n3543)    9/27 dpy-5(e61)/+ I
lin(n3628)    lin(n3628)    1/29 dpy-5(e61)/+ I
lin(n3681)    lin(n3681)    3/22 rol-4(sc8)/+ V
mep-1(n3680)    mep-1(n3680)    0/30 dpy-20(e1282)/+ IV
mep-1(n3702)    mep-1(n3702)    0/16 sP4/+ IV
mep-1(n3703)    mep-1(n3703)    0/16 sP4/+ IV
trr-1(n3630)    rol-6(e187) II    0/20 trr-1(n3630)/+
trr-1(n3637)    rol-6(e187) II    1/20 trr-1(n3637)/+
trr-1(n3704)    rol-6(e187) II    1/30 trr-1(n3704)/+
trr-1(n3708)    rol-6(e187) II    0/20 trr-1(n3708)/+
trr-1(n3709)    rol-6(e187) II    2/30 trr-1(n3709)/+
trr-1(n3712)    rol-6(e187) II    1/19 trr-1(n3712)/+

New mutation    Criteria for X linkage

lin(n3542)    transmission test
lin(n3707)    transmission test
gap-1(n3535)    transmission test
lin-15B(n3436)    males with pseudovulva
lin-15B(n3676)    transmission test, males with pseudovulva
lin-15B(n3677)    males with pseudovulva
lin-15B(n3711)    males with pseudovulva
lin-15B(n3760)    transmission test, males with pseudovulva
lin-15B(n3762)    males with pseudovulva
lin-15B(n3764)    transmission test, males with pseudovulva
lin-15B(n3766)    transmission test, males with pseudovulva
lin-15B(n3768)    transmission test, males with pseudovulva
lin-15B(n3772)    transmission test, males with pseudovulva
sli-1(n3538)    transmission test
sli-1(n3544)    transmission test
sli-1(n3683)    transmission test




Autosomal and sex chromosome linkages were determined as described
below. lin(n3541) was also mapped relative to bli-3(e767) and unc-54(e1092),
mutations present on the extreme left and right arms, respectively, of linkage
group I. Of 16 Muv progeny selected from a lin(n3541) / bli-3(e767)
unc-54(e1092); lin-15A(n767) parent, none were bli-3(e767)/+ whereas six were
unc-54(e1092)/+, indicating that lin(n3541) lies nearer to bli-3(e767).
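
The reasoning behind this two-factor comparison can be sketched numerically. The code below is an illustration under a simplified model, not the procedure used in this study. It assumes that every selected Muv animal is scored, that each of its two chromosomes independently acquired the trans marker by recombination with probability p, and that there is no interference, so the expected share of marker/+ animals is 2p(1 - p). The function name recombination_fraction is hypothetical.

```python
from math import sqrt


def recombination_fraction(heterozygous, scored):
    """Estimate recombination fraction p from the share of selected
    homozygotes that carry the trans marker in heterozygous form.

    Model assumption: expected share of marker/+ animals is 2p(1 - p).
    """
    if scored <= 0 or not 0 <= heterozygous <= scored:
        raise ValueError("need scored > 0 and 0 <= heterozygous <= scored")
    share = heterozygous / scored
    if share > 0.5:
        raise ValueError("a share above 0.5 is not compatible with this model")
    # Solve 2p(1 - p) = share for the root with p <= 0.5.
    return (1 - sqrt(1 - 2 * share)) / 2


# Counts from the text: 0/16 bli-3(e767)/+ and 6/16 unc-54(e1092)/+
p_bli3 = recombination_fraction(0, 16)
p_unc54 = recombination_fraction(6, 16)
print(p_bli3, p_unc54)
```

Under this model the smaller estimate identifies the closer marker, which matches the conclusion that lin(n3541) lies nearer to bli-3(e767). With only 16 animals the estimates are coarse and should be read as a ranking rather than as map distances.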

We then determined if a given mutation failed to complement mutations
of known synMuv genes on the same linkage group. Mutations that were not
assigned to known synMuv complementation groups were tested against
unassigned mutations within the same linkage group for complementation.
These tests defined seven new synMuv loci: trr-1, mep-1, lin(n3441),
lin(n3628), lin(n3681), lin(n3707), and lin(n3542). We used three-factor
crosses to map most of these new synMuv genes within their respective linkage
groups (Table 3).

Table 3 Map data for newly-identified synMuv loci 



A. Three- and four-factor mapping 



Gene 



Genotype of heterozygote 



Genotype of selected 
Phenotype of recombinants (with 
selected respect to unselected 
recombinants markers) 



ark-1 



gap-1 



+ + ark-1 /unc-5 dpy-20 +; Un-15A(n767) 



+ ark-1 + / dpy-20 + unc-30; Un-15A(n767) 



Unc 
Dpy 
Dpy 
Unc 



dpy-20 + ark-1 + / + dpy-26 + unc-30; lin- 
15A(n767) 



dpy-20 + + ark-1 / + lin-3 unc-22 +; Un-15A(n767) Dpy 

Muv 

dpy-20 + ark-1 +/+ unc-22 + unc-30; lin- Dpy 
15A(n767) 

Muv 
Unc-22 
Unc-30 
Dpy-20 

Muv 

Unc 

Dpy 
Unc 

Lon 
Unc 



+ + gap-1 Un-15A(n767)/unc-l dpy-3 + lin- 
15A(n767) 

gap-1 + + Un-15A(n767) /+ unc-2 lon-2 lin- 
15A(n767) 

+ gap-1 + Un-15A(n767) /dpy-3 + unc-2 lin- 
15A(n767) 



10/10 ark-1 / + 
0/1 ark-1 / + 
15/35 ark-1/ + 
17/33 ark-1 / + 
3/9 unc-22 / + 
3/3 unc-22 / + 
1/3 unc-22 / + 

1/2 unc-22 / + 
2/3 ark-1 / + 
5/6 ark-1 / + 
All dpy-26 / + 

3/8 dpy-26 /+ 

17/17 gap-1 / + 

0/8 gap-1 / + 
0/2 gap-1 / + 

6/6 gap-1 / + 
14/18 gap-1 / + 



lin-52 
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Gene 



Genotype of heterozygote 



Genotype of selected 
Phenotype of recombinants (with 
selected respect to unselected 
recombinants markers) 



Un(nS441) 



Un(n3628) 



Un(n3542) 



mep-1 



+ lin-52 + /unc-16 + unc-47; lin-1 5A(n767) 
lin-52 + unc-69/ + slP127 +; lin-1 5A(n767) 
sma-3 + lin-52 +/ + sqv-3 + unc-69; lin- 
15A(n767) 



+ Un(n3441) +/bli-3 + lin-1 7; lin-1 5A(n767) 
bli-3 + lin(n3441) / + spe-15 +; lin-1 5A(n767) 
+ Hn(n3441) lin-17 / spe-15 + +; Un-15A(n767) 

Un(n3628) + + /+ dpy-5 unc-13; lin-1 5A(n767) 

+ Hn(n3628) +/unc-ll + dpy-5; lin-1 5A(n767) 

unc-11 + + Un(n3628)/+ unc-73 lin-44 +; lin- 
15A(n767) 

+ + Hn(n3628) dpy-5 /unc-73 lin-44 + +; lin- 
15A(n767) 

Un(n3628) + dpy-5 /+ unc-38 +; lin-1 5A(n767) 
unc-U Hn(n3628) +/+ + unc-38; lin-1 5A(n767) 

+ + + Un(n3542) lin-1 5A(n7 67) /unc-10 dpy-6lin- 
15A(n767) 

+ Un(n3542) + lin-1 5A(n767) / dpy-6 + unc-9 lin- 
15A(n767) 

+ mep-1 + / unc-5 + dpy-20; lin-1 5A(n767) 

mep-1 + +/ + dpy-20 unc-30; lin-1 5A(n767) 

+ + mep-1 +/unc-24 mec-3 + dpy-20; lin- 
15A(n767) 
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Unc-47 7/9 lin-52 / + 

Muv 3/12 S1P127/ + 

Sma 9/9 sqv-3 / + 

Muv 1/27 sqv-3 / + 

Unc 14/16 lin-52 / + 

Lin-17 9/19 Un(n3441) / + 

Muv 10/185^-75/ + 

Lin-17 11/11^-75/ + 

Dpy 0/6 Hn(n3628) / + 

Unc 6/6 Hn(n3628) / + 

Unc VU Un(n3628) / + 

Dpy 5/11 lin(n3628)/ + 

MuV 3/9 unc- 73 lin-44 /+ + 

Muv 0/2 1 unc- 73 lin-44 /+ + 

Muv 3/7 unc-38 / + 

Muv 0/9 unc-38 / + 

Unc miin(n3542)/ + 

Unc 4/40 Un(n3542) / + 



Unc 56/57 mep-1 / + 

Dpy 2/61 mep-1 / + 

Dpy 0/51 mep-1 / + 

Unc 58/58 mep-1 / + 

UncMec 10/12 we/?-i / + 
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Gene 



Genotype of heterozygote 



Genotype of selected 
Phenotype of recombinants (with 
selected respect to unselected 
recombinants markers) 



sli-1 



trr-1 



+ mep-I dpy-20 + /lin-3 + + unc-22; lin- 
15A(n767) 

+ + mep-1 + /mec-3 sem-3 + dpy-20; lin- 
15A(n767) 



sli-1 + + lin-15A(n767) / + lon-2 unc-6 lin- 
15A(n767) 

sli-1 + + Hn-15A(n767) / + unc-2 lon-2 lin- 
15A(n767) 

sli-1 + + lin-15A(n767)/+dpy-3 unc-2 lin- 
15A(n767) 

sli-1 + + Un-15A(n767)/ + unc-l dpy-3 lin- 
15A(n767) 



+ rol-6 + trr-1 /dpy-10 + unc-4 +; Un-15A(n767) 



+ trr-1 +/ dpy-10 + rol-1; lin-15A(n767) 
+ + trr-1 /dpy-10 unc-53 +; Un-15A(n767) 
+ trr-1 +/ unc-53 .+ rol-1; lin-15A(n767) 

+ trr-1 + rol-1 /unc-4 + mex-1 +; Un-15A(n767) 



Unc 

MecDpy 

Dpy 

Dpy 

Vul 
Mec 

Dpy 

Lon 

Lon 

Dpy 

Unc 
Unc 

Dpy 

Rol 

Dpy 

Unc 

Rol 

Unc 

Unc 

Rol 

Rol 



\lWmep-l/ + 
0/%mep-l/ + 
2/8 mep-1 / + 
5/5 lin-3 / + 

3/10 mep-1 / + 
17/17 mep-1 / + 

6/13 mep-1 /+ 

0/6 sli-1 / + 

5/5 sli-1 / + 

0/10 *//-// + 

6/65/1-// + 
0/14*//-// + 

10/10 sli-1 / + 

3/14 unc-4 / + 
3/3 trr-1 / + 
0/8//T-// + 
9/20&T-// + 
0/17 trr-1 / + 
1/10 trr-1 / + 
7/10 trr-1 / + 
12/14 mex-1 / + 



B. Deficiency mapping

Gene    Genotype of heterozygote    Phenotype of heterozygote

lin-52    unc-36 lin-52 / nDf40 dpy-18; lin-15A(n767)    Muv
mep-1    mep-1 / sDf63 unc-31; lin-15A(n767) / +    Pvl Ste
mep-1    mep-1 / sDf62 unc-31; lin-15A(n767) / +    Pvl Ste
mep-1    mep-1 / sDf10; lin-15A(n767) / +    WT
trr-1    rol-6 trr-1 / mnDf57; lin-15A(n767)    WT
trr-1    rol-6 / unc-4 mnDf90; lin-15A(n767)    WT
trr-1    rol-6 trr-1 / mnDf29; lin-15A(n767)    WT
trr-1    trr-1 / unc-4 mnDf87; lin-15A(n767)    Muv

WT: wild-type; Pvl: protruding vulva; Ste: sterile.

Three- and four-factor crosses were performed using standard methods
(Brenner, Genetics 77: 71-94, 1974). Deficiency heterozygotes were
constructed as described below. In addition, we have isolated trr-1, mep-1,
lin(n3628), and lin(n3681) mutations away from the parental lin-15A(n767)
mutation. mep-1, lin(n3628), and lin(n3681) mutations alone do not cause a
Muv phenotype, and trr-1 mutations alone cause only weak ectopic vulval
induction. Thus, these mutations synergize with lin-15A(n767) and are indeed
synMuv mutations.
We identified mutations in gap-1 and sli-1, two genes that were
originally identified in screens for mutations that suppressed the Vul phenotype
caused by a reduction in let-60 Ras pathway signaling (Jongeward et al.,
Genetics 139: 1553-66, 1995; Hajnal et al., Genes Dev 11: 2715-28, 1997). We
also identified mutations in ark-1, a gene that was first identified in a screen for
mutations that caused ectopic vulval induction in a sli-1 mutant background
(Hopper et al., Mol Cell 6: 65-75, 2000). gap-1, sli-1, and ark-1 single mutants
were previously isolated and found to have no (sli-1, gap-1) or subtle (ark-1)
defects in vulval development. Our results indicate that sli-1, gap-1, and ark-1
act redundantly with lin-15A to negatively regulate let-60 Ras signaling.

Molecular identification of mep-1

We isolated three mutations, n3680, n3702 and n3703, in a gene that we
mapped to a small interval on linkage group IV between sem-3 and dpy-20,
as shown in Figure 1. We attempted to rescue the Muv phenotype of n3680;
lin-15A(n767) mutants using cosmid clones from this interval. Transgenic
animals containing the cosmid M04B2 were rescued for the Muv phenotype
and also showed improved fertility relative to non-transgenic animals. The
genomic sequence of mep-1 is shown in Figure 2. The mep-1 open reading
frame sequence is shown in Figure 3. This gene was originally identified based
on its interaction with the germline specification genes mog-1, mog-4, mog-5
and pie-1 in yeast two-hybrid screens (Belfiore et al., RNA 8: 725-39, 2002).
Because somatic tissues adopt germ cell-specific characteristics in mep-1
mutants, mep-1 is thought to repress germ cell fates in the soma. We
sequenced mep-1 in our mutant strains to determine if the mutations we
isolated affected this gene. These mutations identify functionally important
amino acid residues or domains. n3680 mutants have a missense mutation that,
in the predicted MEP-1 protein, changes a polar serine residue to an asparagine.
n3702 mutants have a nonsense mutation and n3703 mutants a splice acceptor
mutation in the mep-1 gene. Our genetic mapping data, cosmid rescue, and
DNA sequence results indicate that n3680, n3702, and n3703 are mep-1
mutations.

The deduced amino acid sequence of MEP-1 is shown in Figure 4.
mep-1 encodes a protein containing six zinc-finger motifs. Zinc fingers are
known to mediate interactions of proteins with DNA and with other proteins.
The zinc fingers of MEP-1 likely mediate interactions with LET-418 or other
synMuv proteins.

Sequences of synMuv mutations

We determined sequences of mutations that affected additional synMuv
genes (Table 4).

Table 4 Selected synMuv proteins and allele sequences

A. Features of selected synMuv proteins

Protein    No. amino acids    Protein similarities and domains

DPL-1    598    Similar to DP family transcription factors; contains DNA- and E2F-binding domains
EFL-1    342    Similar to E2F family transcription factors; contains DNA-binding, DP-binding and transactivation domains
LET-418    1829    Similar to Mi-2 family ATP-dependent chromatin remodeling enzymes; contains chromodomains, PHD finger motifs and a helicase domain*
LIN-9    LIN-9L: 644; LIN-9S: 642    Similar to Drosophila Aly cell cycle regulator and mammalian proteins of unknown function
LIN-13    2248    Protein has 24 Zn-finger motifs
LIN-35    961    Similar to Retinoblastoma (pRb) family transcriptional regulators; contains "pocket" interaction domain
LIN-36    962    Novel protein with C/H-rich and Q-rich regions
LIN-52    161    Similar to Drosophila and mammalian proteins of unknown function
LIN-53    417    Similar to Drosophila p55, mammalian RbAp48 subunits of chromatin remodeling and histone deacetylase complexes; contains WD repeats
LIN-61    491    Similar to Drosophila l(3)mbt and other MBT repeat-containing proteins
MEP-1    853    Protein has six Zn finger motifs
SLI-1    582    Similar to Cbl family ubiquitination-promoting proteins; contains SH2 domain and RING finger motif
TRR-1    4064*    Similar to mammalian TRRAP transcriptional regulator

B. Allele sequences

Mutation    Wild-type sequence    Mutant sequence    Substitution, splice site change or aberration    Domain affected by missense mutation

dpl-1(n3643)    TAT    TAA    Y341ochre
efl-1(n3639)    CAA    TAA    Q175ochre
let-418(n3536)    CCT    CTT    P675L    helicase/ATPase
let-418(n3626)    GGT    AGT    G1006S    helicase/ATPase
let-418(n3629)    TCC    TTC    S925F    helicase/ATPase
let-418(n3634)    TGG    TAG    W1128amber
let-418(n3635)    CAG    TAG    Q1594amber
let-418(n3636)    ACT    TCT    T807S    helicase/ATPase
                  TGG    TGA    W1329opal
let-418(n3719)    TGG    TAG    W295amber
lin-9(n3631)    CAA    TAA    LIN-9L: Q594ochre; LIN-9S: Q592ochre
lin-9(n3675)    GAT    AAT    LIN-9L: D305N; LIN-9S: D303N    none predicted
lin-9(n3767)    CAG    TAG    LIN-9L: Q509amber; LIN-9S: Q507amber
lin-13(n3642)    CAT    TAT    H832Y    Zn finger
lin-13(n3673)    CAG    TAG    Q1988amber
lin-13(n3674)    CGA    TGA    R1250opal
lin-13(n3726)    GGA    GAA    G229E    none predicted


lin- 

35(n3763)° GCA 



Un- 

36(n3671) 
lin- 

36(n3672) 
lin- 

36013765? 
lin- 

52(n3718) 
lin- 

53(n3448) 
lin- 

53(n3521) 
lin- 

53(n3622) 
lin- 

53(n3623) 
lin- 

61(n3442) 
lin- 

61(n3446) 
tin- 

61(n3447) 
lin- 

61(n3624) 



TTG AAA 
AAG 

CAT 
GAA 

CAG 

GCT 



CAG 



AGT 



GTA 

TTG AAA 
AAA G 

CCT 
AAA 

TAG 

GTT 

TAG 

ATT 



GAA AAA 

AAG/atatgtgt 
(SEQID 

AAG/gtatgtgt NO:30) 

TGG TAG 

aacttcaa/AAT 
(SEQID 

aacttcag/AAT NO:31) 



CAA 



AGT 



£CG 



TAA 



AAT 



TCG 



A555V 

K594frameshift and 
truncation after 
611a.a. 

H284P 
E424K 

Q467amber 

A242V 

Q31 amber 



S3841 



E174K 



Exon 1 donor 



W337amber 



Exon 4 acceptor 



Q412ochre 



S354N 



P132S 



Pocket 



C/H-rich region 
none predicted 



C/H-rich region 



WD repeat 



WD repeat 



MBT repeat 
none predicted

lin-61(n3736)    TTT    TCT    F247S    MBT repeat
mep-1(n3680)    AGT    AAT    S309N    none predicted
mep-1(n3702)    CAG    TAG    Q706amber
mep-1(n3703)    CTT/gtaagttt    CTT/ataagttt (SEQ ID NO:32)    Exon 3 donor
sli-1(n3538)    TCA    TTA    S305L    SH2
sli-1(n3544)    ttttccag/AAA    ttttccaa/AAA (SEQ ID NO:33)    Exon 6 acceptor
sli-1(n3683)    ttttttag/GAT    ttttttaa/GAT (SEQ ID NO:34)    Exon 4 acceptor
trr-1(n3630)    TGG    TAG    W2064amber
trr-1(n3637)    CAG    TAG    Q3444amber
trr-1(n3704)    CAA    TAA    Q694ochre
trr-1(n3708)    CGA    TGA    R1248opal
trr-1(n3709)    CGA    TGA    R2550opal
trr-1(n3712)    TGG    TAG    W2505amber


In the "Wild-type sequence" and "Mutant sequence" columns, exon and intron sequences are
denoted by uppercase and lowercase script, respectively. Nucleotides altered by mutation
are underlined.

* The predicted LET-418 protein contains a sequence described as a helicase domain. This
domain was originally identified in helicases, but has since been found in non-helicase
proteins. Many of these proteins share a common ATPase activity, and this domain contains
residues that are important for ATP binding and hydrolysis.

■ The adenosine inserted by the lin-35(n3763) frameshift mutation is not underlined because it
is unclear which nucleotide in the adenosine repeat was inserted.

* In addition to the missense mutation described, we found an additional mutation associated
with lin-36(n3765). This mutation, AG/gtaagaagaaaagc to AG/gtaagaagaaaagt, is present in
the third intron of lin-36 and creates a possible splice donor sequence. If this splice donor
were used, an in-frame ochre (TAA) stop codon would be encountered, truncating the LIN-36
protein after 261 amino acids.

* Due to alternative splicing, trr-1 encodes proteins that range in length between 4051 and
4061 amino acids.

DPL-1 and EFL-1 are described by Ceol et al., Mol Cell 7: 461-73, 2001 and Page et al., Mol
Cell 7: 451-60, 2001; LIN-9 is described by Beitel et al., Gene 254: 253-63, 2000; LIN-13 is
described by Melendez et al., Genetics 155: 1127-37, 2000; LIN-35 and LIN-53 are described by
Lu et al., Cell 95: 981-91, 1998; LIN-36 is described by Thomas et al., Development 126: 3449-59,
1999; and SLI-1 is described by Yoon et al., Science 269: 1102-5, 1995.

Most mutations are GC-to-AT transitions that are characteristic of EMS
mutagenesis (Anderson, Methods Cell Biol pp. 31-58, 1995). Many of these
mutations are predicted to truncate the corresponding synMuv proteins. The
truncations predicted by efl-1(n3639), let-418(n3719), and lin-52(n3718) are
particularly severe, and the synMuv and sterile phenotypes caused by these
mutations may represent the null phenotypes of these genes. In addition, we
found missense mutations that disrupt predicted functional domains of synMuv
proteins. For example, n3536, n3626, n3629 and one of the two mutations of
n3636 affect the ATPase/helicase domain of LET-418. LET-418 is a member of
the Mi-2 family of ATP-dependent chromatin remodeling enzymes (Solari et al.,
Curr Biol 10: 223-6, 2000; Von Zelewsky et al., Development 127: 5277-84,
2000), and the LET-418 missense mutations suggest that LET-418 function is
similarly dependent on ATP hydrolysis. At least one mutation affecting the
LIN-13 protein, n3642, is predicted to disrupt a canonical zinc-finger motif.
This missense mutation indicates that at least some of the twenty-four LIN-13
zinc fingers are important for its synMuv activity. Missense mutations affecting
other synMuv proteins are not as easily linked to the disruption of predicted
functional domains. These mutations may provide a useful starting point in
identifying functional motifs within synMuv proteins that are not predicted by
sequence comparisons.
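
The codon changes listed in Table 4 can be checked mechanically. The sketch below is a hypothetical helper, not the method used to annotate the alleles. It translates the wild-type and mutant codons with the standard genetic code, names stop codons (TAA ochre, TAG amber, TGA opal), and flags changes that read as G-to-A or C-to-T on the listed strand, which is how a GC-to-AT transition appears in a coding sequence. It handles single-base codon substitutions only.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_NAMES = {"TAA": "ochre", "TAG": "amber", "TGA": "opal"}


def translate(codon):
    """Return the one-letter amino acid, or the stop codon name."""
    return STOP_NAMES.get(codon, CODON_TABLE[codon])


def classify(wild_type, mutant):
    """Describe a single-base codon change such as TGG -> TAG."""
    diffs = [(w, m) for w, m in zip(wild_type, mutant) if w != m]
    if len(wild_type) != 3 or len(mutant) != 3 or len(diffs) != 1:
        raise ValueError("expected two codons differing at exactly one base")
    return {
        "from": translate(wild_type),
        "to": translate(mutant),
        "gc_to_at": diffs[0] in {("G", "A"), ("C", "T")},
    }


# let-418(n3634): TGG -> TAG, listed as W1128amber
print(classify("TGG", "TAG"))
# let-418(n3636): ACT -> TCT, listed as T807S
print(classify("ACT", "TCT"))
```

Applied to the let-418 alleles, the helper reproduces the listed amino acid changes and shows that the ACT-to-TCT change in n3636 is an A-to-T change rather than a GC-to-AT transition, in line with the statement that most, not all, mutations are of the EMS-typical kind.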

Frequency of mutant isolation

The rate at which we isolated mutations was much higher than that
observed in previous synMuv screens: including those 63 mutations described
in this study, we recovered one synMuv mutation per 107 haploid genomes
screened versus 1/750 (Ferguson et al., Genetics 123: 109-21, 1989), 1/400 and
1/667 in previous screens. We believe the reasons for this difference are
threefold. First, our screen design allowed the isolation of synMuv mutations
that also caused sterility. Sterile synMuv mutants were observed previously,
but because the heterozygous siblings of these mutants were present in a sea of
genotypically unrelated animals, the underlying mutations could not be
recovered. Second, our parental strain carried the strong class A mutation,
lin-15A(n767). The penetrance of a strain's Muv phenotype is dependent on
the aggregate strengths of the component synMuv mutations. Therefore, even
weak mutations may be identified in a strong synMuv background such as
lin-15A(n767). Although we have not formally tested this possibility, we
believe that some of the mutations we recovered only weakly affect synMuv
activity. Such mutations may not have been recovered in previous screens that
were performed in partial loss-of-function synMuv backgrounds. Third, in
screening a plate of many F2 progeny derived from a single F1 animal, we
observed many genotypically identical animals per haploid genome screened.
This type of screening likely accounts for our isolation of a number of partially
penetrant synMuv mutations. Such mutations may not have been identified in
earlier synMuv screens that typically observed fewer genotypically identical
animals per haploid genome screened.
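
The comparison of recovery rates can be made explicit with a short calculation. This is a minimal sketch using only the rates quoted above; the names THIS_SCREEN, PREVIOUS_SCREENS and fold_increase are illustrative.

```python
from fractions import Fraction

# Mutations recovered per haploid genome screened, as quoted in the text.
THIS_SCREEN = Fraction(1, 107)
PREVIOUS_SCREENS = [Fraction(1, 750), Fraction(1, 400), Fraction(1, 667)]


def fold_increase(rate, reference):
    """Ratio of two recovery rates as a float."""
    return float(rate / reference)


for reference in PREVIOUS_SCREENS:
    print(f"versus {reference}: {fold_increase(THIS_SCREEN, reference):.1f}-fold")
```

By this arithmetic the recovery rate in this screen was roughly 3.7-fold to 7-fold higher than in the earlier screens.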

Our high rate of recovery indicates many genes can mutate to a synMuv
phenotype. Including the ten genes we identified in this study, a total of 25
genes can act redundantly with class A synMuv genes. Many of these genes
are represented by one or a few mutant alleles, indicating that screens for
synMuv genes are not saturated.

The synMuv genes we identified likely act in different pathways

Class B synMuv mutations synergize with class A synMuv mutations,
but not with other class B synMuv mutations. Such genetic behavior led to the
hypothesis that class B synMuv genes are part of a single genetic pathway
(Ferguson et al., Genetics 123: 109-21, 1989). In support of this hypothesis,
mutations affecting different class B synMuv genes are similarly suppressed by
loss-of-function mutations in the let-23 receptor tyrosine kinase and other
let-60 Ras pathway loss-of-function mutations (Ferguson et al., Nature
326: 259-67, 1987), a subset of class B synMuv gene products have been shown
to interact in vitro, and their homologs are known to function together in other
systems (Lu et al., Cell 95: 981-91, 1998; Ceol et al., Mol Cell 7: 461-73,
2001). Because we conducted our screen in a class A synMuv background, we
anticipated recovering mutations that affected genes of the class B synMuv
pathway. In addition to class B synMuv mutations, our results suggest that we
recovered mutations that disable distinct genetic pathways. We recovered six
mutations that affect the trr-1 gene. Unlike typical class B synMuv mutations,
trr-1(n3712) synergizes not only with class A synMuv mutations, but also with
class B synMuv mutations. trr-1(n3712) single mutants also atypically show
ectopic vulval induction. Because of its unusual genetic interactions, we
propose that trr-1 functions in a pathway distinct from the class B synMuv
pathway. We also recovered mutations affecting the sli-1, gap-1, and ark-1
genes. These genes were previously characterized as negative regulators of
let-60 Ras pathway activity, acting genetically downstream of the let-23
receptor tyrosine kinase (Jongeward et al., Genetics 139: 1553-66, 1995;
Hajnal et al., Genes Dev 11: 2715-28, 1997; Hopper et al., Mol Cell 6: 65-75,
2000). The molecular identities of sli-1, gap-1, and ark-1 support their action
downstream of let-23. sli-1 encodes a homolog of the c-cbl proto-oncoprotein,
which is thought to downregulate receptor tyrosine kinase levels through
ubiquitin-mediated degradation (Yoon et al., Science 269: 1102-5, 1995;
Levkowitz et al., Mol Cell 4: 1029-40, 1999). gap-1 encodes a member of the
GTPase-activating protein family (Hajnal et al., Genes Dev 11: 2715-28, 1997).
GAPs enhance the catalytic function of Ras family GTPases, thereby
facilitating the switch from active GTP-bound to inactive GDP-bound Ras.
ark-1 encodes a predicted cytoplasmic tyrosine kinase that interacts with the
SEM-5 SH2/SH3 adaptor protein (Hopper et al., Mol Cell 6: 65-75, 2000).
Since sem-5 acts downstream of the let-23 receptor tyrosine kinase, ark-1 is
proposed to inhibit let-60 Ras signaling downstream of let-23. These genetic
and molecular data suggest that sli-1, gap-1, and ark-1 directly regulate let-60
Ras pathway members and are likely not part of the canonical class B synMuv
pathway, which is thought to regulate the let-60 Ras pathway either upstream
of, or in parallel to, the let-23 receptor tyrosine kinase. We are currently
placing our synMuv mutations into different genetic classes by examining
interactions with class B synMuv and let-23 mutations.
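
The genetic logic used to assign mutations to classes can be summarized in a few lines. The sketch below is a deliberately simplified model and not a description of the experiments. It assumes a double mutant is Muv whenever its two mutations fall in different redundant classes, and it ignores allele strength, temperature, and the weak ectopic induction seen in trr-1 single mutants. The class label "trr-1-like" is a placeholder introduced here for illustration.

```python
# Class assignments taken from the text; "trr-1-like" is a placeholder label.
GENE_CLASS = {
    "lin-15A": "A",
    "lin-35": "B",
    "lin-36": "B",
    "lin-52": "B",
    "trr-1": "trr-1-like",
}


def double_mutant_is_muv(gene_1, gene_2):
    """Predict a Muv phenotype when two mutations disable different classes."""
    return GENE_CLASS[gene_1] != GENE_CLASS[gene_2]


print(double_mutant_is_muv("lin-52", "lin-15A"))  # class B with class A
print(double_mutant_is_muv("lin-52", "lin-35"))   # class B with class B
print(double_mutant_is_muv("trr-1", "lin-35"))    # trr-1 with class B
```

In this model a gene that synergizes with both class A and class B mutations, as reported for trr-1(n3712), cannot belong to either class, which is the reasoning behind placing trr-1 in a distinct pathway.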

lin-52 encodes a new putative Rb pathway protein

lin-35, a member of the class B synMuv pathway, encodes a protein
similar to the mammalian tumor suppressor pRb (Lu et al., Cell 95: 981-91,
1998). Other genes with class B synMuv activity encode DP, E2F, RbAp48,
histone deacetylase and HP1 family proteins (Lu et al., Cell 95: 981-91, 1998;
Ceol et al., Mol Cell 7: 461-73, 2001; Couteau et al., EMBO Rep 3: 235-41,
2002). Mammalian homologs of these proteins are known to functionally, and
in some cases physically, interact with pRb. These and other parallels indicate
that the class B synMuv pathway is an analog of Rb pathways in other systems.
Consequently, additional class B synMuv genes may have homologs with
analogous functions in other systems. One such gene is lin-52. By the genetic
criteria outlined above, lin-52 is a class B synMuv gene. lin-52 mutations
synthetically interact with class A mutations, but not with class B mutations.
Furthermore, preliminary experiments indicate that the Vul phenotype of a
let-23 loss-of-function mutation is epistatic to the Muv phenotype caused by
lin-52 and lin-15A loss of function. lin-52 encodes a small protein, portions of
which are conserved in similarly small proteins predicted by the human, mouse
and Drosophila genome sequences. The characterization of these and other
class B synMuv protein homologs should help to determine whether they too
function in Rb-mediated signaling.

The experiments described above were carried out as follows.

Strains and general techniques

Strains were cultured as described by Brenner (Genetics 77: 71-94,
1974) and grown at 20°C unless otherwise indicated. The wild-type parent of
all the strains described in this study was the Caenorhabditis elegans Bristol
strain N2. For some two- and three-factor mapping experiments we used the
polymorphic strain RW7000 (Williams et al., Genetics 131: 609-24, 1992).
We also used strains containing the following mutations:

LGI: bli-3(e767), lin-17(n677), unc-11(e47), unc-73(e936), lin-44(n1792),
unc-38(x20), dpy-5(e61), lin-35(n745), lin-61(sy223), unc-13(e1091),
lin-53(n833) (Ferguson et al., Genetics 123: 109-21, 1989), unc-54(e1092)
(Dibb et al., J Mol Biol 183: 543-51, 1985).

LGII: lin-31(n301), dpy-10(e128), tra-2(q276), rol-6(e187), dpl-1(n2994),
unc-4(e120), unc-53(n569), mex-1(it9), rol-1(e91).

LGIII: dpy-17(e164), lon-1(e185), sma-3(e491), lin-13(n770) (Ferguson et al.,
Genetics 123: 109-21, 1989), lin-37(n758), lin-36(n766), unc-36(e251),
lin-9(n112), unc-32(e189), unc-16(e109), sqv-3(n2842), lin-52(n771)
(Ferguson et al., Genetics 123: 109-21, 1989), unc-47(e307), unc-69(e587),
dpy-18(e364).

LGIV: lin-1(e1275), unc-5(e53), unc-24(e138), mec-3(e1338), lin-3(n378),
sem-3(n1900), dpy-20(e1282), unc-22(e66), dpy-26(n198), unc-31(e169),
unc-30(e191), lin-54(n2231), dpy-4(e1166).

LGV: tam-1(cc567) (Hsieh et al., Genes Dev 13: 2958-70, 1999),
unc-46(e177), let-418(s1617), dpy-11(e224), rol-4(sc8), unc-76(e911),
efl-1(n3318) (Ceol et al., Mol Cell 7: 461-73, 2001), dpy-21(e428).

LGX: sli-1(sy143), aex-3(ad418), unc-1(e1598n1201), dpy-3(e27),
gap-1(ga133) (Hajnal et al., Genes Dev 11: 2715-28, 1997), unc-2(e55),
lon-2(e678), unc-10(e102), dpy-6(e14), unc-9(e101), unc-3(e151),
lin-15A(n767), lin-15AB(n765).

Unless otherwise noted, the mutations used are described by Riddle et al.
(C. elegans II, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory
Press, 1997). In addition, we used strains containing the following
chromosomal aberrations: mnDf57 II (Sigurdson et al., Genetics 108: 331-45,
1984), mnDf90 II (Sigurdson et al., Genetics 108: 331-45, 1984), mnDf29 II
(Sigurdson et al., Genetics 108: 331-45, 1984), mnDf87 II (Sigurdson et al.,
Genetics 108: 331-45, 1984), mIn1[dpy-10(e128) mIs14] II (Edgley et al.,
Mol Genet Genomics 266: 385-95, 2001), mnC1[dpy-10(e128) unc-52(e444)] II
(Herman, Genetics 88: 49-65, 1978), nDf40 III (Hengartner et al., Nature
356: 494-9, 1992), qC1[dpy-19(e1259) glp-1(q339)] III (Austin et al., Cell
58: 565-571, 1989), sDf63 IV, sDf62 IV (Clark et al., Mol Gen Genet 232:
97-105, 1992), sDf10 IV (Rogalski et al., Genetics 102: 725-36, 1982),
eT1(III;V) (Rosenbluth et al., Genetics 99: 415-28, 1981), nT1(IV;V)
(Ferguson et al., Genetics 110: 17-72, 1985). mIs14, an integrated transgene
linked to the chromosomal inversion mIn1, consists of a combination of
GFP-expressing transgenes that allow mIs14-containing animals to be scored
beginning at the 4-cell stage of embryogenesis (Edgley et al., Mol Genet
Genomics 266: 385-95, 2001).

Isolation of new alleles 

We mutagenized lin-15A(n767) hermaphrodites with ethyl
methanesulfonate (EMS) as described by Brenner (Genetics 77: 71-94, 1974).
We allowed these animals to recover on food for between 15 minutes and one
hour, and then transferred individual P0 larvae in L4 lethargus to 50 mm plates.
After three to five days, 20 F1 L4 larvae per P0 were individually transferred to
50 mm plates, and, subsequently, F2 animals on these plates were screened for
a Muv phenotype. We screened the progeny of 3380 F1 animals using this
procedure.
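
The scale of this screen can also be expressed as mutagenized haploid genomes. The following minimal sketch assumes, as is conventional for a clonal F2 screen of self-fertilizing hermaphrodites, that each F1 animal carries two independently mutagenized haploid genomes; the function names are hypothetical and are used only for illustration.

```python
def haploid_genomes_screened(n_f1: int, genomes_per_f1: int = 2) -> int:
    """Number of mutagenized haploid genomes represented by n_f1 F1 animals.

    genomes_per_f1 = 2 is an assumption: one mutagenized genome from each
    P0 gamete.
    """
    if n_f1 < 0:
        raise ValueError("n_f1 must be non-negative")
    return n_f1 * genomes_per_f1


def min_p0_parents(n_f1: int, f1_per_p0: int = 20) -> int:
    """Smallest number of P0 animals that could supply n_f1 F1 animals
    when at most f1_per_p0 F1 animals are picked per P0."""
    return -(-n_f1 // f1_per_p0)  # ceiling division


print(haploid_genomes_screened(3380))  # 6760
print(min_p0_parents(3380))            # 169
```

Under these assumptions, the 3380 F1 animals correspond to 6760 haploid genomes derived from at least 169 P0 parents.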

Linkage group assignment 

We used the following markers to determine linkage of newly isolated
synMuv mutations to autosomes: dpy-5 I, rol-6 II, unc-32 III, dpy-20 IV, rol-4
V. We generated animals heterozygous for the new synMuv mutation and for
at least two of these markers. For fertile synMuv mutants we picked Muv
progeny and determined if these progeny segregated the markers, whereas for
sterile synMuv mutants we picked single marker homozygotes and determined
if these animals segregated the synMuv mutation. We also mapped some
mutations using polymorphisms present in the RW7000 strain. We generated
animals heterozygous for the new synMuv mutation and for RW7000 markers.
We picked individual Muv progeny of these animals, performed lysis and used
the resulting template DNA to monitor linkage to each of the autosomes by
PCR (Williams et al., Genetics 131: 609-24, 1992). We tested for sex linkage
to assign some new synMuv mutations to the X chromosome. Briefly, we
generated heterozygous or hemizygous mutant males and mated them with
marked lin-15A(n767) hermaphrodites. We then determined whether all,
indicating sex linkage, or roughly half, indicating autosomal linkage, of the
cross-progeny hermaphrodites of this mating segregated the synMuv mutation.
Some lin-15B mutations were not tested for sex linkage. Instead, we
tentatively assigned X-chromosome linkage based on the presence, when
lin-15A(n767) males were mated with these mutants, of cross-progeny males
with pseudovulval ventral protrusions. Such protrusions are often observed in
hemizygous lin-15AB mutant males (Ferguson et al., Genetics 110: 17-72,
1985) but are found at a much lower penetrance in lin-15A(n767) males that are
hemizygous for an X-linked synMuv mutation affecting genes other than
lin-15B. The mutations we assigned in this manner were later determined by
complementation tests to affect lin-15B.
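
The logic of the sex-linkage test described above can be sketched in a few lines. A male hemizygous for an X-linked mutation transmits it to every cross-progeny hermaphrodite, whereas a male heterozygous for an autosomal mutation transmits it to about half. The function name, the all-or-not decision rule and the example counts are assumptions made for illustration; they are not the scoring rule used in this work.

```python
def sex_linkage_call(n_segregating: int, n_total: int):
    """Classify a mutation from counts of cross-progeny hermaphrodites.

    Returns (call, p_all_if_autosomal), where p_all_if_autosomal is the
    probability that all n_total hermaphrodites would carry an autosomal
    mutation inherited from a heterozygous father (0.5 per animal).
    """
    if n_total <= 0 or not 0 <= n_segregating <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_segregating <= n_total")
    p_all_if_autosomal = 0.5 ** n_total
    if n_segregating == n_total:
        return "sex-linked", p_all_if_autosomal
    return "autosomal", p_all_if_autosomal


# Hypothetical counts: 10 of 10 hermaphrodites segregate the mutation.
print(sex_linkage_call(10, 10))
# Hypothetical counts: 6 of 12 hermaphrodites segregate the mutation.
print(sex_linkage_call(6, 12))
```

The second returned value indicates how many cross-progeny must be scored before an "all" result becomes unlikely under autosomal linkage.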



Complementation tests

We typically performed complementation tests by mating males
heterozygous for the new mutation and hemizygous for lin-15A(n767), or, if
X-linked, males hemizygous for both the new mutation and lin-15A(n767), into
marked synMuv mutant hermaphrodites, all of which contained a lin-15A
mutation. Hemizygous lin-15B(n3711) lin-15A(n767) males could not mate.
To perform complementation tests with this mutation, we mated tra-2(q276);
lin-15B(n3711) lin-15A(n767)/++ XX males into marked lin-15AB
hermaphrodites. For new mutations that caused recessive sterility, we
generated heterozygous males by starting matings with wild-type L4 males and
individual gravid, putative heterozygous mutant hermaphrodites. For
complementation tests we used cross-progeny males derived from plates that
had self-progeny Muv animals present. In all complementation tests, unmarked
cross-progeny hermaphrodites were scored.

Construction of deficiency heterozygotes

To construct trr-1(n3712) heterozygotes with mnDf57, mnDf90 and
mnDf29, Df/mIn1; lin-15A(n767) males were generated. These males were
mated into rol-6 trr-1(n3712)/mIn1; lin-15A(n767) hermaphrodites and
non-Rol, non-Gfp cross-progeny were scored. mnDf87 heterozygous males do
not mate, so in this case we generated lin(n3712)/mnDf87; lin-15A(n767)
animals by mating lin(n3712)/mIn1; lin-15A(n767) males into unc-4
mnDf87/mIn1; lin-15A(n767) hermaphrodites. To construct the lin-52
heterozygote with nDf40, we mated nDf40 dpy-18/unc-36; lin-15A(n767)
males into unc-36 lin-52(n771); lin-15A(n767) hermaphrodites and scored
non-Unc cross-progeny. mep-1/Df animals were constructed by mating Df/nT1;
+/nT1 males into dpy-20 mep-1; lin-15A(n767) hermaphrodites and scoring
non-Dpy cross-progeny.

Transgenic animals 

Germline transformation was performed as described by Mello et al.
(EMBO J 10: 3959-70, 1991), by injecting cosmid (5-10 ng/µL) or plasmid
(50-80 ng/µL) DNA into lin-52 or mep-1 mutants. Either pRF4, which causes a
dominant Rol phenotype, or pPD93.97, which expresses gfp under the control
of the myo-3 promoter, was used as a coinjection marker.

lin-52 cDNA isolation

We obtained a partial lin-52 cDNA clone, yk253b12, that included 249
nucleotides of the lin-52 open reading frame and also included the 3'
untranslated region and a polyA tail. We used the 5' RACE system v2.0 for
rapid amplification of cDNA ends (GIBCO-BRL, Life Technologies, Inc.,
Gaithersburg, Maryland) to determine the 5' end of the lin-52 transcript. We
ligated the two portions of the lin-52 cDNA together to generate a full-length
cDNA clone. The lin-52 5' RACE products were trans-spliced to the SL2 leader
sequence, consistent with observations made by Zorio et al. (Nature 372:
270-2, 1994).

Allele sequence 

We used PCR-amplified regions of genomic DNA as templates in
determining gene sequences. For each gene investigated, we determined the
sequences of all exons and splice junctions. Whenever observed, the sequence
of a mutation was confirmed using an independently derived PCR product. All
sequences were determined using an automated ABI 373 DNA sequencer.

Example II 

As detailed below, we have identified a distinct class of genes, termed
the class C synMuv genes, that negatively regulate vulval induction.

Proper vulval development in the nematode C. elegans requires that
specific ectodermal cells, termed Pn.p cells, adopt different cell fates. The
specification of Pn.p cells that eventually make vulval tissue occurs in two
steps, each of which involves the selection of a subset of Pn.p cells from a
larger Pn.p field (Sulston, Dev Biol 56: 110-56, 1977). In the first step, which
occurs in the L1 larval stage shortly after the Pn.p cells are generated, anterior
and posterior Pn.p cells fuse with the syncytial hypodermis. After this first
step, the unfused midbody P(3-8).p cells each have the capacity to adopt a
vulval cell fate (Sternberg et al., Cell 44: 761-72, 1986). In a second step,
however, only three of these cells, P(5-7).p, adopt such fates, in which they
undergo three rounds of division to generate seven or eight descendants. P3.p,
P4.p and P8.p adopt non-vulval fates, typically dividing only once to generate
two descendants that eventually fuse with the syncytial hypodermis. The
decision to adopt vulval cell fates occurs during the L2 and early L3 larval
stages and is followed by cell divisions and differentiation in the L3 and L4
larval stages, respectively (Sternberg et al., Cell 44: 761-72, 1986; Ferguson et
al., Nature 326: 259-67, 1987). While mutations in class C synMuv genes
alone cause mild defects, when a class C gene mutation is combined with either
a class A or class B mutation, the two mutations synergize to produce more
severe vulval induction and other developmental defects. The class C synMuv
genes trr-1, hat-1, and epc-1 encode homologs of the transcriptional
coactivator TRRAP, the MYST family acetyltransferases TIP60 and Esa1p, and
the Drosophila Enhancer of Polycomb (E(Pc)) protein, respectively. Because
of the predicted acetyltransferase activity of the HAT-1 protein and because
the orthologous TRRAP and E(Pc) family proteins have been copurified in
histone acetyltransferase complexes, we propose that a combination of histone
acetyltransferase and histone deacetylase activities is required to properly
specify vulval cell fates in C. elegans.

trr-1 interacts with class A and class B synMuv mutations 

We performed a genetic screen for synMuv mutants in a lin-15A(n767)
background and identified six mutations in our pool of isolates that failed to
complement each other and that defined the gene trr-1. To quantitate the
synMuv phenotype in these mutants, we scored the number of cells that were
induced to become vulva.

To more precisely quantitate the Muv phenotype of trr-1; lin-15A
strains, we scored the numbers of P(3-8).p cells induced per animal and found
that all strains had a similarly penetrant, temperature-sensitive hyperinduced
phenotype (Table 5A).

Table 5 trr-1 mutations cause a hyperinduced phenotype

A. trr-1 interactions with synMuv mutations

                                 Temp   Ave. # P(3-8).p   % animals
Genotype                         (°C)   induced (±SE)     hyperinduced   n
wild-type                        20     3.00 (±0)         0              31
lin-15A(n767)                    20     3.00 (±0)         0              24
lin-38(n751)                     20     3.00 (±0)         0              27
trr-1(n3630); lin-15A(n767)      20     4.52 (±0.15)      82             45
trr-1(n3637); lin-15A(n767)      20     4.52 (±0.14)      83             54
trr-1(n3704); lin-15A(n767)      20     4.20 (±0.13)      79             43
trr-1(n3708); lin-15A(n767)      20     4.71 (±0.14)      92             36
trr-1(n3709); lin-15A(n767)      20     4.81 (±0.13)      95             39
trr-1(n3712); lin-15A(n767)      20     4.07 (±0.12)      74             54
lin-15A(n767); trr-1(RNAi)       20     5.60 (±0.08)      100            44
trr-1(n3712) lin-38(n751)        20     4.14 (±0.23)      79             14
lin-38(n751); trr-1(RNAi)        20     5.66 (±0.08)      100            32
wild-type                        15     3.00 (±0)         0              29
lin-15A(n767)                    15     3.00 (±0)         0              32
trr-1(n3704); lin-15A(n767)      15     3.13 (±0.05)      21             24
trr-1(n3712); lin-15A(n767)      15     3.06 (±0.03)      13             32
wild-type                        25     3.00 (±0)         0              36
lin-15A(n767)                    25     3.02 (±0.02)      3.6            28
trr-1(n3704); lin-15A(n767)      25     5.87 (±0.06)      100            38
trr-1(n3712); lin-15A(n767)      25     5.47 (±0.14)      100            17

B. trr-1 single mutants

                                 Temp   Ave. # P(3-8).p   % animals
Genotype                         (°C)   induced (±SE)     hyperinduced   n
wild-type                        20     3.00 (±0)         0              31
trr-1(n3630)                     20     3.03 (±0.02)      6.1            33
trr-1(n3637)                     20     3.08 (±0.04)      13             30
trr-1(n3704)                     20     3.01 (±0.01)      2.6            39
trr-1(n3708)                     20     3.05 (±0.03)      8.1            37
trr-1(n3709)                     20     3.03 (±0.02)      6.3            32
trr-1(n3712)                     20     3.10 (±0.03)      13             89
trr-1(RNAi)                      20     3.09 (±0.05)      13             32
wild-type                        15     3.00 (±0)         0              29
trr-1(n3704)                     15     3.08 (±0.05)      12             26
trr-1(n3712)                     15     3.06 (±0.03)      12             25
wild-type                        25     3.00 (±0)         0              36
trr-1(n3704)                     25     3.04 (±0.03)      3.9            51
trr-1(n3712)                     25     3.07 (±0.03)      13             48

The number of P(3-8).p cells induced was scored as described below.
Induction was scored after raising strains at the indicated temperature for two
generations. trr-1 mutant homozygotes were scored by examining the non-Gfp
progeny of trr-1/mIn1[dpy-10(e128) mIs14] heterozygous parents.
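
The summary columns of Table 5 can be derived from per-animal induction scores. The sketch below is illustrative only: it assumes that the reported error is the standard error of the mean (sample standard deviation divided by the square root of n) and that an animal counts as hyperinduced when more than three P(3-8).p cells are induced. The per-animal scores are hypothetical and are not data from this work.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_induction(scores, wild_type_level=3.0):
    """Return (average, standard error, % hyperinduced, n) for a list of
    per-animal counts of induced P(3-8).p cells.

    Assumption: hyperinduced means a score above wild_type_level (3 cells).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("at least one animal must be scored")
    average = mean(scores)
    # The standard error is undefined for a single animal; report 0 then.
    std_err = stdev(scores) / sqrt(n) if n > 1 else 0.0
    pct_hyper = 100.0 * sum(s > wild_type_level for s in scores) / n
    return average, std_err, pct_hyper, n


# Hypothetical scores for four animals.
avg, se, pct, n = summarize_induction([3, 4, 5, 6])
print(f"{avg:.2f} (±{se:.2f}), {pct:.0f}% hyperinduced, n = {n}")
```

A strain in which every animal scores exactly three induced cells yields 3.00 (±0) and 0 percent hyperinduced, matching the form of the wild-type rows.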

The hyperinduction we observed occurred in P3.p, P4.p and P8.p to
similar extents. To determine if trr-1 interacted with other class A synMuv
genes, we constructed a trr-1(n3712) lin-38 double mutant. These double
mutant animals were also hyperinduced (Table 5A), suggesting that trr-1
functions in parallel not only to lin-15A, but to the class A synMuv pathway in
general.

We also isolated trr-1(n3712) and the other trr-1 mutations away from
any other synMuv mutations. Nearly all class A and class B synMuv single
mutants adopt a wild-type pattern of P(3-8).p fates (Table 5B); however, trr-1
adults had a weakly penetrant hyperinduced phenotype (Table 5B). By
examining the cell fates adopted by individual P(3-8).p cells in L4 animals, we
determined that the vulval cell-fate transformations of trr-1 single mutants
always occurred in P8.p (Figure 5). In addition to ectopic vulval cell-fate
transformations, all trr-1 mutations caused slow growth and sterility, although
some mutant animals occasionally produced a small number of eggs (<10, as
compared to ~300 for the wild-type), all of which died during embryogenesis.

To determine if trr-1 interacts with class B synMuv genes, we
constructed double mutant strains containing trr-1(n3712) and mutations of
class B synMuv genes. Interestingly, double mutant strains combining
trr-1(n3712) with mutations of lin-15B, lin-35 Rb, and lin-37 showed a
significant increase in the penetrance of P8.p transformation (Figure 6). In
addition to the increase in P8.p transformation, we occasionally observed
ectopic transformations of P3.p and P4.p. Since lin-15B(n744), lin-35(n745)
and lin-37(n758) are strong loss-of-function and possibly null mutations of
their corresponding genes, these results indicate that trr-1 functions
redundantly with at least a subset of class B synMuv genes.

No significant increase was observed in trr-1(n3712); lin-36(n766)
double mutants (Figure 6). By various genetic criteria, this loss-of-function
lin-36 mutation behaves unlike mutations in other class B synMuv genes
(Hsieh et al., Genes Dev 13: 2958-70, 1999; Fay et al., Genes Dev 16: 503-17,
2002). There are at least two possibilities to explain the unusual behavior of
lin-36(n766). First, the lack of enhancement could be allele specific, with the
lin-36(n766) mutation disrupting a function that is redundant with a class A
synMuv function but not disrupting a separable lin-36 function that is
redundant with trr-1 activity. Alternatively, our observations with lin-36 could
reflect a gene-specific lack of enhancement. For example, the strength of the
lin-36 defect may not be equivalent to that of other class B synMuv gene
defects, such that lack of lin-36 activity may be readily observable in a class A
synMuv background but, unlike other class B synMuv defects, not observable
in a trr-1 background. Enhancement tests using additional lin-36 alleles will
help to resolve this issue.

trr-1 encodes a protein similar to mammalian TRRAP 

We mapped trr-1 to a small region of LGII and cloned the gene using
transformation rescue as detailed below. To confirm the identity of trr-1, we
obtained a partial cDNA and, using RNA derived from this cDNA, found that
RNA-mediated interference (RNAi) of this gene caused a highly penetrant
hyperinduced phenotype in lin-15A and lin-38 mutant backgrounds (Table 5).
As determined by RT-PCR and 5' RACE, the trr-1 gene consists of 22 exons,
four of which are alternatively spliced (Figure 7A). Since the sites of
alternative splicing are separated by only six or nine nucleotides, the most
exclusive (4054 amino acids) and inclusive (4064 amino acids) isoforms differ
slightly in size. The genomic sequence of trr-1 is shown in Figure 8. The
sequence of the trr-1 open reading frame is shown in Figure 9.

The deduced amino acid sequence of TRR-1 is shown in Figure 10. The
predicted TRR-1 proteins are similar to mammalian myc-associated protein
TRRAP (transformation/transcription domain-associated protein) and its yeast
homolog Tra1p throughout most of their lengths (McMahon et al., Cell 94:
363-74, 1998; Saleh et al., J Biol Chem 273: 26559-65, 1998). TRRAP and
Tra1p are similarly large proteins, extending 3828 and 3744 amino acids,
respectively. The largest predicted TRR-1 isoform is 25 percent identical to
TRRAP and 19 percent identical to Tra1p. TRR-1, TRRAP, and Tra1p share
limited regions of homology with other proteins (Figure 7B). One of these
regions is located at the carboxy terminus and is similar to the catalytic
domains of ATM and PI-3-like kinases. Interestingly, the DXXXXN (SEQ ID
NO:29) and DFG motifs critical for kinase activity are not present in TRR-1,
TRRAP, or Tra1p (Hunter et al., Cell 83: 1-4, 1995). Instead of having an
enzymatic function, this domain of TRRAP has been proposed to mediate
protein-protein interactions (McMahon et al., Cell 94: 363-74, 1998). All six
trr-1 mutations introduce nonsense codons (Figure 7B). trr-1(n3637) is
predicted to truncate the protein just prior to the ATM/PI-3 kinase-like
domain. The phenotypic strength of trr-1(n3637) is similar to that of other
alleles, suggesting that deletion of the ATM/PI-3 kinase-like domain alone
results in a severe loss of protein function. Finally, trr-1(n3630),
trr-1(n3637), and trr-1(n3712) introduce amber stop codons, and we observed
that the sterility associated with these alleles was reduced by the
sup-5(e1464) informational suppressor tRNA mutation. This suppression,
along with the partially penetrant sterility caused by trr-1(RNAi), confirms that
the sterility observed in trr-1 mutants is truly due to loss of trr-1 function.

trr-1(RNAi) is synthetically lethal with mutations in lin-35 Rb and other
class B synMuv genes

trr-1(RNAi) caused more severe phenotypic consequences than did trr-1
mutations. For example, the ectopic induction phenotype of lin-15A;
trr-1(RNAi) mutants was much stronger than that of trr-1; lin-15A mutant
strains (Table 5). We do not believe this difference is reflective of a partial loss
of gene function caused by all of the trr-1 mutations. Instead we propose that
at least some of the mutations cause a severe loss of gene function and that the
difference is due to an effect of trr-1(RNAi) on maternally-provided gene
activity. In support of this proposal, trr-1(n3704)/mnDf87; lin-15A and
trr-1(n3712)/mnDf87; lin-15A mutants that were severely deficient in
zygotically-provided trr-1 activity but retained maternally-provided trr-1
activity had phenotypic penetrances that were similar to those of trr-1; lin-15A
homozygotes and were weaker than those of lin-15A; trr-1(RNAi) mutants.
Also arguing that trr-1; lin-15A homozygotes have significantly reduced
zygotically-provided trr-1 gene activity, the protein truncations predicted by
trr-1(n3704) and other trr-1 mutations are likely to remove functional domains
and compromise TRR-1 activity.

We further characterized the effects of trr-1(RNAi). In wild-type and
class A synMuv genetic backgrounds, trr-1(RNAi) caused retarded growth,
adult sterility and weakly penetrant embryonic and larval lethalities (Table 6).

Table 6 trr-1(RNAi) is synthetically lethal with class B but not with class A
synMuv mutations

                               % dead        % dead        Total % lethality
Genotype                       embryos       larvae        (n)
wild-type                      0             0             [illegible]
trr-1(RNAi)                    [illegible]   [illegible]   [illegible]
lin-15A(n767)                  [illegible]   [illegible]   [illegible]
lin-38(n751)                   0.1           [illegible]   [illegible]
lin-15B(n744)                  0.2           [illegible]   [illegible]
lin-35(n745)                   0.6           [illegible]   [illegible]
lin-36(n766)                   [illegible]   [illegible]   [illegible]
dpl-1(n2994)                   [illegible]   1.1           [illegible]
lin-15A(n767); trr-1(RNAi)     3.2           [illegible]   [illegible]
lin-38(n751); trr-1(RNAi)      3.8           [illegible]   [illegible]
lin-15B(n744); trr-1(RNAi)     [illegible]   [illegible]   [illegible]
lin-35(n745); trr-1(RNAi)      66.2          33.8          100 (263)
lin-36(n766); trr-1(RNAi)      19.4          21.6          41.0 (444)
dpl-1(n2994); trr-1(RNAi)      45.1          53.6          98.7 (304)

Animals injected with trr-1 dsRNA were individually plated 10-15
hours following injection. Injected animals were subsequently transferred to
new plates every 24 hours until egg laying had ceased. Dead embryos and
larvae on a plate were counted at least two days after eggs were laid. All of the
mutant strains in which trr-1(RNAi) was performed are homozygous viable.
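
The total lethality column of Table 6 appears to be the sum of the embryonic and larval columns. The short check below assumes that relationship and uses only the three rows whose values are fully legible in Table 6; the function name is hypothetical.

```python
def total_lethality(pct_dead_embryos: float, pct_dead_larvae: float) -> float:
    """Total percent lethality, assumed to be the sum of the embryonic and
    larval percentages, rounded to one decimal place as in Table 6."""
    return round(pct_dead_embryos + pct_dead_larvae, 1)


# (% dead embryos, % dead larvae) for the fully legible rows of Table 6.
rows = {
    "lin-35(n745); trr-1(RNAi)": (66.2, 33.8),
    "lin-36(n766); trr-1(RNAi)": (19.4, 21.6),
    "dpl-1(n2994); trr-1(RNAi)": (45.1, 53.6),
}

for genotype, (embryos, larvae) in rows.items():
    print(f"{genotype}: {total_lethality(embryos, larvae)}")
```

The computed totals (100.0, 41.0 and 98.7) agree with the totals reported for these three genotypes.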

Interestingly, trr-1(RNAi) caused highly penetrant embryonic and larval
lethalities in combination with many class B synMuv mutations. Most of the
dead embryos arrested at the late embryonic pretzel stage and those that
hatched died shortly thereafter. We have not yet determined a basis for this
lethality. It is important to note that many of the class B synMuv mutations
tested are predicted to have severe effects on their cognate class B synMuv
proteins. Since trr-1(RNAi) can synthetically interact with strong
reduction-of-function or null class B synMuv mutations, these data indicate
that trr-1 functions redundantly with class B synMuv genes not only in vulval
cell-fate determination but also in an essential process earlier in development.

trr-1(RNAi) causes synthetic lethality in a lin-36(n766) background,
although the penetrance of this lethality is not as high as in other class B
synMuv backgrounds. This assay therefore unmasks a redundancy between
trr-1 and lin-36 that we did not observe in the P8.p induction assay. As
discussed above, the strength of the lin-36 defect may not be equivalent to the
strengths of defects of other class B synMuv genes. This difference in
strengths may explain why, relative to other class B synMuv genes, lin-36
shows weaker interactions with trr-1 in terms of synthetic lethality and
synthetic P8.p induction.

trr-1 synthetically interacts with dpl-1 DP 

Mammalian TRRAP and yeast Tra1p are thought to function as
coactivator proteins that bridge transcription factors to histone
acetyltransferases (McMahon et al., Cell 94: 363-74, 1998; Brown et al.,
Science 292: 2333-7, 2001). Based on coimmunoprecipitation and functional
assays, E2F transcription factors were linked to TRRAP (McMahon et al., Cell
94: 363-74, 1998; Lang et al., J Biol Chem 276: 32627-34, 2001). In vivo, E2F
and DP family proteins form heterodimers that are bound by Rb family proteins
via a direct interaction with the E2F subunit (reviewed by Dyson, Genes Dev
12: 2245-62, 1998; Trimarchi et al., Nat Rev Mol Cell Biol 3: 11-20, 2002).
We previously determined that one of two C. elegans E2F family members,
efl-1, and the sole DP family member, dpl-1, are class B synMuv genes (Ceol et
al., Mol Cell 7: 461-73, 2001). As noted above, lin-35 Rb was also
characterized as a class B synMuv gene, and the LIN-35 Rb protein was found
to form a complex with DPL-1 and EFL-1 in vitro (Lu et al., Cell 95: 981-91,
1998; Ceol et al., Mol Cell 7: 461-73, 2001).

LIN-35 Rb and Rb proteins in other species are thought to recruit histone
deacetylase complexes to regulate E2F-dependent transcription
(Brehm et al., Nature 391: 597-601, 1998; Luo et al., Cell 92: 463-73, 1998;
Magnaghi-Jaulin et al., Nature 391: 601-5, 1998). Coupling these results with
our genetic finding that trr-1 acts redundantly with lin-35 Rb to negatively
regulate vulval induction, one might speculate that EFL-1 and DPL-1 recruit
distinct LIN-35-containing and TRR-1-containing complexes to appropriately
regulate vulval cell fate determination. To examine this possibility, we wished
to determine if trr-1 acted through efl-1 and dpl-1 to negatively regulate vulval
development.

Without being tied to a particular theory, three lines of evidence suggest
that trr-1 does not act solely through the transcription factors efl-1 and dpl-1:
first, the ectopic induction of P8.p in dpl-1 trr-1 double mutants is greater than
that observed in either single mutant (Figure 6). Because of the sterility
conferred by the dpl-1(n3316) null and trr-1(n3712) mutations, these mutants
were derived from dpl-1(n3316) trr-1(n3712) / ++ mothers. It is notable that in
this test we substantially reduced maternally-provided dpl-1 activity by
injecting mothers with dpl-1 dsRNA and scoring dpl-1(n3316 RNAi)
trr-1(n3712) progeny; second, in a weak lin-15A mutant background at 15°C,
trr-1(RNAi) greatly enhanced the ectopic induction observed in dpl-1 mutant
animals that were derived from dpl-1 heterozygous mutant mothers (Table 7);

Table 7 trr-1 acts redundantly with dpl-1

                                           Ave. # P(3-8).p   % animals
Genotype                                   induced (±SE)     mutant (n)
lin-15A(n433); trr-1(RNAi)                 3.17 (±)          20 (15)
dpl-1(n3316); lin-15A(n433)                3.00 (±0)         0 (35)
dpl-1(n3316); lin-15A(n433); trr-1(RNAi)   4.98 (±)          89 (45)

Animals were raised at 15°C, a temperature at which dpl-1(n3316); lin-15A(n433) mutants do not show
hyperinduction. dpl-1(n3316) homozygous mutants were recognized as the Unc non-Gfp progeny of
dpl-1(n3316) unc-4(e120)/mIn1[dpy-10(e128) mIs14] heterozygous parents.

third, when performed in a homozygous dpl-1 mutant background, trr-1(RNAi)
caused synthetic lethality with dpl-1 (Table 6). Since viable trr-1(RNAi) dpl-1
progeny could be derived from heterozygous, but not homozygous, dpl-1
mutant mothers, this synthetic lethality apparently required a lack of
maternally-provided dpl-1 activity. These results indicate that trr-1 does not
act only through dpl-1 to regulate vulval development and embryonic and
larval viability. Although all of these assays were conducted in dpl-1 mutant
backgrounds, we expect that, since reduction of dpl-1 function is predicted to
affect all C. elegans DP/E2F activity, these results similarly apply to efl-1.

In addition to these data, one other observation argues against the model
that trr-1 acts solely through dpl-1. Whereas double mutants containing
lin-35(n745), a putative null allele of lin-35, and trr-1(n3712) display highly
penetrant ectopic induction of P8.p, the ectopic induction in dpl-1(n3316 RNAi)
mutants is relatively weak (Figure 6). If both lin-35 and trr-1 were acting
solely through dpl-1, defects of equivalent strengths would be expected.

The Muv phenotype of trr-1 mutants requires let-60 Ras pathway activity 

Previous studies determined that a conserved Ras pathway induces
vulval development in C. elegans (reviewed by Sternberg et al., Trends Genet
14: 466-72, 1998). Loss-of-function mutations affecting genes in this pathway
cause a vulvaless (Vul) phenotype characterized by P(3-8).p adopting
hypodermal instead of vulval cell fates. To determine if Ras pathway activity
is required for the trr-1 mutant phenotype, we constructed strains in which the
functions of trr-1, lin-15A and a Ras pathway gene were reduced. The
uninduced phenotype caused by let-23 receptor tyrosine kinase and let-60 Ras
mutations was epistatic to the hyperinduced phenotype caused by trr-1 and
lin-15A loss of function (Table 8).

Table 8 trr-1 epistasis with let-23 RTK, let-60 Ras and lin-3 EGF

                                             Ave. # P(3-8).p   % animals
Genotype                                     induced (±SE)     hyperinduced   n
wild-type                                    3.00 (±0)         0              31
lin-15A(n767)                                3.00 (±0)         0              24
lin-15A(n767); trr-1(RNAi)                   5.60 (±0.08)      100            44
let-23(sy97); lin-15A(n767)                  0.02 (±0.02)      0              28
let-23(sy97); lin-15A(n767); trr-1(RNAi)     0.05 (±0.03)      0              42
let-60(n1876); lin-15A(n767)                 0 (±0)            0              17
let-60(n1876); lin-15A(n767); trr-1(RNAi)    0 (±0)            0              23
lin-3(n378); lin-15A(n767)                   0.30 (±0.07)      0              40
lin-3(n378); lin-15A(n767); trr-1(RNAi)      4.35 (±0.20)      85             20

let-23(sy97) homozygous mutants were recognized as Rol Unc non-Gfp progeny of rol-6(e187)
let-23(sy97) unc-4(e120)/mIn1[dpy-10(e128) mIs14]; lin-15A(n767) heterozygous parents, and
let-60(n1876) homozygous mutants were recognized as Unc progeny of let-60(n1876) unc-22(e66)/nT1;
+/nT1; lin-15A(n767) heterozygous parents.



These results indicate that Ras pathway activity is required to produce the
trr-1; lin-15A Muv phenotype. By contrast, trr-1; lin-3; lin-15A triple mutants
showed a wild-type level of induction in P(5-7).p and ectopic induction in P3.p,
P4.p and P8.p. lin-3 encodes an EGF-like protein that is produced by the
gonadal anchor cell and is thought to act non-cell autonomously to stimulate
Ras pathway activity in P(5-7).p (Hill et al., Nature 358: 470-6, 1992). These
findings suggest that a basal level of lin-3-independent Ras pathway activity,
when combined with mutations in trr-1 and lin-15A, is sufficient to induce
vulval cell fates in P(3-8).p.
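
The epistasis reasoning in this section can be made concrete with a small
sketch. The Python below is illustrative only: it uses mean induction values
copied from Table 8, while the comparison rule (a triple mutant whose mean
falls below the wild-type level of 3.00 is treated as underinduced) and all
function names are assumptions introduced here, not part of the analysis
described in this document.

```python
# Mean number of P(3-8).p cells induced, copied from Table 8.
WILD_TYPE_INDUCTION = 3.00

TABLE_8_MEANS = {
    "lin-15A(n767); trr-1(RNAi)": 5.60,
    "let-23(sy97); lin-15A(n767); trr-1(RNAi)": 0.05,
    "lin-3(n378); lin-15A(n767); trr-1(RNAi)": 4.35,
}


def classify_induction(mean_induced, wild_type=WILD_TYPE_INDUCTION):
    """Label a genotype relative to wild type (assumed, simplified rule)."""
    if mean_induced < wild_type:
        return "underinduced"
    if mean_induced > wild_type:
        return "hyperinduced"
    return "wild-type"


def ras_mutation_is_epistatic(triple_genotype, means=TABLE_8_MEANS):
    """True if the triple mutant is underinduced rather than hyperinduced."""
    return classify_induction(means[triple_genotype]) == "underinduced"


let23_epistatic = ras_mutation_is_epistatic(
    "let-23(sy97); lin-15A(n767); trr-1(RNAi)"
)
lin3_epistatic = ras_mutation_is_epistatic(
    "lin-3(n378); lin-15A(n767); trr-1(RNAi)"
)
```

Under this simplified rule the let-23 triple mutant is classified as
underinduced, whereas the lin-3 triple mutant remains hyperinduced, matching
the interpretation given in the text.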

hat-1 and epc-1, but not ssl-1, loss of function phenocopies trr-1 

TRRAP and Tra1p are components of protein complexes that acetylate
histones (Allard et al., Embo J 18: 5108-19, 1999; reviewed by Brown et al.,
Trends Biochem Sci 25: 15-9, 2000). These complexes are distinguished by
their histone acetyltransferase subunits: the mammalian TFTC and p/CAF and
the yeast SAGA complexes contain Gcn5 family acetyltransferases, whereas
the mammalian TIP60 and the yeast NuA4 complexes contain MYST family
acetyltransferases.

To determine if TRR-1 might function with a histone acetyltransferase
in C. elegans, we used RNA-mediated interference to inactivate such genes.
Whereas inactivation of a Gcn5 homolog Y47G6A.6 had no effect, inactivation
of a MYST family gene we have named hat-1 produced a highly penetrant
Muv phenotype in a lin-15A background. To further characterize hat-1, we
isolated a deletion allele, n4075, that removes 1010 base pairs from the hat-1
locus and is predicted to produce a protein that contains the first 35 amino acids
of HAT-1 followed by 52 unrelated amino acids prior to termination (Figure
11A). The genomic nucleic acid sequence of hat-1 is shown in Figure 12. The
nucleic acid sequence of the hat-1 open reading frame is shown in Figure 13.
The predicted full-length HAT-1 protein is 458 amino acids long, and this
deletion is expected to remove the conserved chromodomain and
acetyltransferase catalytic domain (Figure 11B). The amino acid sequence of
the wild-type HAT-1 protein is shown in Figure 14. hat-1(n4075) mutants
exhibited the same spectrum of phenotypes and genetic interactions as trr-1
mutants. hat-1(n4075) single mutants were slow growing and sterile. In
combination with class A synMuv mutations, hat-1(n4075) caused a severe
Muv phenotype characterized by P3.p, P4.p and P8.p ectopic induction (Table
9). Alone, hat-1(n4075) caused ectopic induction of P8.p (Figure 11C). In
combination with a lin-15B mutation, the penetrance of this ectopic induction
was greatly increased (Figure 11D).

The TIP60 and NuA4 complexes contain other proteins in addition to
MYST family acetyltransferases. We inactivated C. elegans genes encoding
homologs of these proteins and identified epc-1 as a negative regulator of
vulval induction. The genomic sequence of epc-1 is shown in Figure 16. The
nucleic acid sequence of the epc-1 open reading frame is shown in Figure 17.
epc-1 encodes a homolog of the Drosophila Enhancer of Polycomb (E(Pc))
protein and similarly named mammalian and yeast proteins. The deduced
amino acid sequence of EPC-1 is shown in Figure 18. Aside from their
association with MYST family histone acetyltransferases, little is known about
the molecular interactions of E(Pc)-like proteins. Inactivation of epc-1 caused
fully penetrant embryonic lethality in the broods of animals injected with RNA.
To study the effects of epc-1 inactivation during postembryonic development,
we injected epc-1 RNA into RNAi-deficient hermaphrodites and subsequently
mated these animals with RNAi-competent males, a procedure referred to as
"zygotic RNAi" (Herman, Development 128: 581-90, 2001). For many genes
that act during multiple stages of development, this scheme has been shown to
provide sufficient gene activity for embryonic functions, but inadequate gene
activity for postembryonic functions. epc-1(RNAi) performed in this manner
did not affect vulval induction in wild-type animals, but produced a Muv
phenotype in lin-15A and lin-38 mutant backgrounds (Table 9).

Table 9 hat-1 and epc-1 but not ssl-1 loss of function phenocopies trr-1 loss
of function

Genotype                                 Ave. # P(3-8).p   % animals   n
                                         induced (±SE)     mutant

wild-type                                3.00 (±0)         0           31
lin-15A(n767)                            3.00 (±0)         0           24
lin-38(n751)                             3.00 (±0)         0           27
lin-15B(n744)                            3.00 (±0)         0           20
hat-1(n4075)                             3.15 (±0.08)      15          20
hat-1(n4075); lin-15A(n767)              3.76 (±0.14)      76          25
hat-1(n4075); lin-15B(n744)              3.71 (±0.10)      77          31
rde-1/+; epc-1(RNAi)                     3.00 (±0)         0           65
rde-1/+; lin-15A(n767); epc-1(RNAi)      3.32 (±0.10)      36          33
lin-38(n751); rde-1/+; epc-1(RNAi)       3.29 (±0.02)      31          65
rde-1/+; lin-15B(n744); epc-1(RNAi)      3.03 (±0.02)      4.2         48
rde-1/+; ssl-1(RNAi)                     3.00 (±0)         0           37
rde-1/+; lin-15A(n767); ssl-1(RNAi)      3.00 (±0)         0           42
rde-1/+; lin-15B(n744); ssl-1(RNAi)      3.01 (±0.01)      2.9         70


hat-1(n4075) homozygous mutants were recognized as the non-Unc progeny of
+/nT1 n754; hat-1(n4075)/nT1 n754 heterozygous parents. Since RNAi of epc-1
and ssl-1 using standard methods causes highly penetrant embryonic lethality,
we performed "zygotic RNAi" as described in the text.



A low percentage of P8.p induction was observed in a lin-15B background.
We recently obtained a deletion allele that removes 886 bases from the epc-1
locus, including the third and fourth epc-1 exons (Figure 5A). If the second
exon were spliced to the fifth exon, a 137 amino acid protein would be
produced that contains the first 109 amino acids of the 795 amino acid
predicted EPC-1 protein. Preliminary studies indicate that epc-1(n4076)
homozygotes are sterile and, with respect to vulval induction, show genetic
interactions similar to those of epc-1(RNAi), trr-1 and hat-1 mutants.

TRRAP copurified with the p400 protein as part of the mammalian
TIP60 and p400 complexes (Fuchs et al., Cell 106: 297-307, 2001). The p400
complex was isolated based on its interaction with the adenovirus E1A
oncoprotein and was also shown to associate with c-myc. The p400 protein
itself is a member of the SWI2/SNF2 family of proteins and, like many
SWI2/SNF2 family members, was shown to possess ATPase activity. We
identified a C. elegans homolog of p400, which we named ssl-1 (ssl,
SWI2/SNF2-like). ssl-1 genomic sequence and the predicted SSL-1 protein
product are shown in Figure 19. Figure 16B shows the nucleotide positions of
the predicted exons with respect to ssl-1 genomic sequence. The cDNA
sequence of ssl-1 is shown in Figure 20. The deduced protein sequence is
shown in Figure 21. The function of ssl-1 was studied by RNAi. ssl-1(RNAi)
caused an embryonic lethal phenotype reminiscent of that caused by
epc-1(RNAi). In both cases, dead embryos generally arrested just prior to
morphogenesis and apparently lacked the hypodermal ridge that is a
characteristic of enclosed embryos. We are currently characterizing this
phenotype further. "Zygotic" RNAi of ssl-1, using the same procedure as
described above, caused no vulval defects in wild-type, lin-15A, or lin-15B
genetic backgrounds. These results suggest that ssl-1 may act with epc-1 in an
essential embryonic process.

trr-1 acts redundantly with lin-35 Rb to antagonize let-60 Ras signaling

Identifying factors involved in cell fate determination is important for
understanding how cells that contain the same genomic information can adopt
different cell fates during animal development. As they help to distinguish
P3.p, P4.p and P8.p from P(5-7).p, trr-1, hat-1, and epc-1 are such cell fate
determination genes. Given their molecular identities, trr-1, hat-1, and epc-1
likely act at the level of transcription, either in an instructive or permissive
fashion, to create differences in gene expression in P3.p, P4.p and P8.p as
compared to P(5-7).p.

Many of the pathways involved in regulating cell fate determination are
conserved. In many cases, pathways that control cell fate determination in
model organisms have been shown to regulate cellular proliferation in mammals.
Pathways that regulate vulval cell fate specification in C. elegans provide clear
examples. A conserved let-60 Ras pathway induces vulval cell fates, and this
pathway is antagonized by the class B lin-35 Rb pathway. trr-1, and likely
hat-1 and epc-1, act in parallel to lin-35 Rb to negatively regulate let-60 Ras
pathway signaling. These comparisons suggest that mammalian counterparts
of trr-1, hat-1, and epc-1 may similarly act in parallel to Rb and antagonize
Ras in the control of cell proliferation.

trr-1, hat-1, and epc-1 likely share a common function

The vulval phenotypes and genetic interactions of trr-1, hat-1, and epc-1
mutants are strikingly similar. In light of the copurification of their
mammalian and yeast counterparts, these data strongly suggest that TRR-1,
HAT-1, and EPC-1 proteins function as part of a protein complex. To
conclusively demonstrate such an interaction, strains containing mutations in
two of these genes will be constructed. If these genes act in the same
complex, one would not expect to observe synergism in double mutants. In
addition, protein-protein interaction studies will be performed. Among the
putative complex members tested, trr-1, hat-1, and epc-1 were the only
candidates we identified by RNAi. It is possible that these three genes encode
an indispensable core of a putative HAT complex that associates with other
proteins whose functions are dispensable for proper vulval development. The
large size of TRR-1 may require it to be divided into fragments to perform
protein interaction studies.

hat-1 mutants likely have defects in histone acetylation 

The best studied MYST family acetyltransferases are the yeast Esa1p
and mammalian TIP60 proteins. Esa1p was found to preferentially acetylate
histone H4 (Smith et al., Proc Natl Acad Sci USA 95: 3561-5, 1998; Clark et
al., Mol Cell Biol 19: 2515-26, 1999; Suka et al., Mol Cell 8: 476-9, 2001).
Furthermore, depletion of Esa1p resulted in global reduction of the acetylation
of H4 and, to a lesser extent, of other nucleosomal histones (Reid et al., Mol
Cell 6: 1297-307, 2000; Suka et al., Mol Cell 8: 476-9, 2001). HAT-1 function
is assayed using commercially available antisera that specifically recognize
acetylated isoforms of histones to determine whether hat-1 mutants have gross
defects in histone acetylation. Differences in acetylation between hat-1
mutants and wild-type animals are determined by whole-mount staining of fixed
animals or by chromatin immunoprecipitation.

Putative HAT complex function

Histone acetyltransferases have been characterized as transcriptional
coactivators (reviewed by Roth et al., Biochem 70:81-120, 2001), and TRRAP
and its yeast homolog Tra1p are proposed to bridge interactions between
activation domains of DNA-binding transcription factors and histone
acetyltransferases (Brown et al., Science 292: 2333-7, 2001). Therefore, a
putative TRR-1/EPC-1/HAT-1 complex may function in transcriptional
activation (Figure 22). If so, one would expect it to activate genes that
negatively regulate vulval development.

While most data support the link between acetylation and activation,
additional observations suggest that at least some histone acetylation may be
important for gene silencing. For example, loss-of-function mutations that
affect the MYST family acetyltransferases Sas2p and Sas3p cause defects in
silencing of mating type loci and telomeres in yeast (Reifsnyder et al., Nat
Genet 14:42-9, 1996; Ehrenhofer-Murray et al., Genetics 145:923-34, 1997).
Sas2p and Sas3p are proposed to acetylate newly-deposited nucleosomes, and
the modified acetyllysine residues they create are thought to be important for
establishing silencing following DNA replication (Meijsing et al., Genes Dev
15: 3169-82, 2001; Osada et al., Genes Dev 15:3155-68, 2001). These residues
may include acetyllysine 16 on histone H4, which is implicated in mating type
loci and telomeric silencing in yeast (Johnson et al., Embo J 11: 2201-9, 1992;
Meijsing et al., Genes Dev 15: 3169-82, 2001). Other acetylated histone
isoforms are prevalent in silent chromatin. For instance, Drosophila
heterochromatin is enriched in acetyllysine 12 of histone H4 (Turner et al., Cell
69: 375-84, 1992). Just as a MYST family histone acetyltransferase is linked
to silencing, loss-of-function studies in Drosophila indicate a role for E(Pc) in
transcriptional repression. E(Pc) mutations synergize with polycomb group
mutations to strongly derepress homeobox genes and act alone as suppressors
of variegation to derepress genes that are juxtaposed to heterochromatin (Sato
et al., Genetics 105: 357-70, 1983; Sinclair et al., Genetics 148: 211-20, 1998).
These observations allow us to consider the possibility that HAT-1, in
association with TRR-1 and EPC-1, may normally downregulate transcription
(Figure 22). By this model, one would expect a putative TRR-1/EPC-1/HAT-1
complex to silence genes that are required for vulval cell fates. Because we do
not know the relevant targets of TRR-1/EPC-1/HAT-1, we cannot distinguish
between transcriptional activating versus repressing models at this time.

Putative TRR-1/EPC-1/HAT-1 complex DNA targeting

Their coimmunoprecipitation and cooperation in reporter gene
activation suggest that mammalian TRRAP can be targeted by E2F proteins to
DNA (McMahon et al., Cell 94: 363-74, 1998; Lang et al., J Biol Chem 276:
32627-34, 2001). We investigated the possibility of TRR-1 targeting by
DP/E2F heterodimers by studying genetic interactions between trr-1 and dpl-1.
dpl-1 is the only DP family member in C. elegans, and therefore loss of dpl-1
activity is expected to effectively reduce all DP/E2F heterodimer function in
the organism. dpl-1 synthetically interacted with trr-1 in vulval induction and
viability assays. It is especially relevant that we observed synergism in some
of these assays when using dpl-1(n3316 RNAi) mutants, which are severely
compromised for dpl-1 function. These results, combined with the observation
that the defects of trr-1 single mutants are stronger than those of dpl-1 single
mutants, suggest that trr-1 acts only partially or not at all through dpl-1. If not
only through DPL-1, how might a putative TRR-1/EPC-1/HAT-1 complex be
targeted to DNA? Studies in yeast indicate that the TRRAP homolog Tra1p
directly interacts with acidic activation domains of transcription factors (Brown
et al., Trends Biochem Sci 25: 15-9, 2000). TRR-1 may similarly be targeted to
DNA by transcription factors other than DPL-1. The assays we have used to
characterize trr-1 provide a means of identifying and evaluating candidate
transcription factors and other proteins that may function with TRRAP family
members in targeted histone acetylation.

The experiments described in Example II were carried out as described below. 


Strains and genetics 

Strains were cultured as described by Brenner (Genetics 77: 71-94,
1974) and maintained at 20°C unless otherwise specified. Bristol N2 was used
as the wild-type strain. The following mutations were used: LGI: lin-35(n745);
LGII: dpy-10(e128), let-23(sy97), rol-6(e187), dpl-1(n2994, n3316) (Chapters
2, 3), unc-4(e120), trr-1(n3630, n3637, n3704, n3708, n3709, n3712) (this
study), mex-1(it9), lin-38(n751); LGIII: lon-1(e185), sup-5(e1464),
lin-36(n766), lin-37(n758); LGIV: lin-3(n378), let-60(n1876) (Beitel et al.,
Nature 348: 503-9, 1990); LGV: dpy-11(e224), rde-1(ne219)
(Tabara et al., Cell 99: 123-32, 1999); LGX: lin-15B(n744), lin-15A(n767,
n433) (Ferguson et al., Genetics 123: 109-21, 1989) and, unless otherwise
noted, are described in Riddle et al., C. elegans II (Cold Spring Harbor, New
York: Cold Spring Harbor Laboratory Press, 1997). The deficiencies mnDf90
and mnDf87 (Sigurdson et al., Genetics 108: 331-45, 1984), translocation nT1
n754 (IV;V) (Ferguson et al., Genetics 110: 17-72, 1985), and chromosomal
inversion mIn1[dpy-10(e128) mIs14] (Edgley et al., Mol Genet Genomics
266:385-95, 2001) were also used. mIs14, an integrated transgene linked to
the chromosomal inversion mIn1, consists of a combination of GFP-expressing
transgenes that allow mIs14-containing animals to be identified
beginning at the 4-cell stage of embryogenesis (Edgley et al., Mol Genet
Genomics 266:385-95, 2001).

P(3-8).p induction assay 

In the wild-type, P(5-7).p adopt vulval fates in which they divide during
the L3 larval stage to generate seven or eight descendants. P3.p, P4.p and P8.p
adopt non-vulval fates, typically dividing once to generate two descendants that
fuse with the hypodermis. Induction was scored in L4 hermaphrodites using
Nomarski DIC microscopy by counting the number of descendants produced
by individual P(3-8).p cells. Different scores, 1, 0.5 and 0 cells induced, were
assigned to cells that were fully, partially or not induced, respectively.
Partially induced P(3-8).p cells have one daughter that produces a complement
of induced descendants while the other daughter fails to divide.
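
The scoring scheme above can be sketched in a few lines of Python to show how
per-cell scores would be combined into the summary values reported in Tables 8
and 9. This is a minimal sketch under stated assumptions: the standard error is
taken to be the sample standard deviation divided by the square root of n, an
animal is counted as hyperinduced when more than three cells in total are
induced, and the function names and example animals are hypothetical. The
document does not state how these statistics were actually computed.

```python
import math

VALID_SCORES = (0, 0.5, 1)  # not induced, partially induced, fully induced


def animal_score(cell_scores):
    """Total induction for one animal: sum of scores for P3.p through P8.p."""
    if len(cell_scores) != 6:
        raise ValueError("expected six scores, one per P(3-8).p cell")
    if any(score not in VALID_SCORES for score in cell_scores):
        raise ValueError("each score must be 0, 0.5 or 1")
    return sum(cell_scores)


def summarize(animals, wild_type_level=3.0):
    """Return (mean, standard error, percent hyperinduced) for scored animals."""
    totals = [animal_score(cells) for cells in animals]
    n = len(totals)
    mean = sum(totals) / n
    if n > 1:
        # Assumed definition: sample variance (n - 1 denominator).
        variance = sum((t - mean) ** 2 for t in totals) / (n - 1)
        standard_error = math.sqrt(variance / n)
    else:
        standard_error = 0.0
    # Assumed definition: hyperinduced means more than the wild-type total.
    percent_hyperinduced = 100.0 * sum(t > wild_type_level for t in totals) / n
    return mean, standard_error, percent_hyperinduced


# Hypothetical animals; scores are listed for P3.p, P4.p, P5.p, P6.p, P7.p, P8.p.
example_animals = [
    [0, 0, 1, 1, 1, 0],    # wild-type pattern
    [0, 0, 1, 1, 1, 0.5],  # P8.p partially induced
    [1, 1, 1, 1, 1, 1],    # all six cells induced
]
mean, standard_error, percent_hyperinduced = summarize(example_animals)
```

A population in which every animal shows the wild-type pattern gives a mean of
3.00 with a standard error of 0, as reported for the wild-type rows of the
tables.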
trr-1 cloning

We mapped trr-1 to an interval on LGII between the right endpoint of the
deficiency mnDf90 and the mex-1 gene. To clone the trr-1 gene, we performed
transformation rescue as described by Mello et al. (Embo J 10: 3959-70,
1991), using the pRF4 plasmid (80 ng/µL) as a coinjection marker. We
rescued the trr-1 Muv and sterile phenotypes by injecting the cosmid C47D12
(10 ng/µL) into trr-1(n3712)/mIn1[dpy-10(e128) mIs14]; lin-15A(n767)
mutants and isolating Rol non-Gfp transgenic lines. trr-1 corresponds to the
predicted gene C47D12.1.


RNAi analyses 

Templates for in vitro transcription reactions were made by PCR
amplification of either cDNAs and their flanking T3 and T7 promoter
sequences or coding exons from genomic DNA using T3 and T7-tagged
oligonucleotides. In vitro-transcribed RNA was annealed and injected as
described by Fire et al. (Nature 391: 806-11, 1998).
In addition to the genes described above, we injected RNA corresponding to
C. elegans genes that encode homologs of the TRRAP complex proteins
TIP48/TAP54a (C. elegans predicted gene T22D1.1), TIP49/TAP54
(C27H6.2), Eaf3p (Y37D8A.9), p33ING (Y51H1A.4), and AF-9 (M04B2.3)
(Loewith et al., Mol Cell Biol 20: 3807-16, 2000; Eisen et al., J Biol Chem 276:
3484-91, 2001; Fuchs et al., Cell 106: 297-307, 2001; Nourani et al., Mol Cell
21: 7629-40, 2001; Gavin et al., Nature 415: 141-7, 2002; Ho et al., Nature
415: 180-3, 2002). We did not observe vulval lineage defects after injection of
these RNAs into either wild-type or synMuv single mutant backgrounds.
Lastly, bacteria designed to express double-stranded RNA corresponding to the
Gcn5 homolog Y47G6A.6 (Fraser et al., Nature 408: 325-30, 2000) were fed to
wild-type and synMuv single mutant hermaphrodites. As described above, we
did not observe vulval defects following this treatment.

Deletion allele isolation

Genomic DNA pools from mutagenized worms were screened for
deletions essentially as described by Plasterk et al. (Nat Genet 17: 119-21,
1997). Deletion mutant animals were isolated from frozen stocks and were
backcrossed four times prior to use. hat-1(n4075) removes nucleotides +106 to
+1115, epc-1(n4076) nucleotides +2014 to +2899 and ssl-1(n4077) nucleotides
+5075 to +5757 of genomic DNA relative to their respective predicted
translational start sites.
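
As a quick arithmetic check on these coordinates, the short Python sketch below
computes the length of each deletion, assuming both endpoints are included in
the deleted interval. The dictionary and function names are hypothetical and
introduced only for illustration.

```python
# Deletion endpoints relative to each predicted translational start site,
# as listed in the text above.
DELETIONS = {
    "hat-1(n4075)": (106, 1115),
    "epc-1(n4076)": (2014, 2899),
    "ssl-1(n4077)": (5075, 5757),
}


def deletion_length(first, last):
    """Number of nucleotides removed, counting both endpoints (assumption)."""
    return last - first + 1


deletion_lengths = {
    allele: deletion_length(first, last)
    for allele, (first, last) in DELETIONS.items()
}
```

With inclusive endpoints the hat-1 and epc-1 deletions come to 1010 and 886
nucleotides, which agrees with the sizes stated for these alleles elsewhere in
this document; the same calculation gives 683 nucleotides for ssl-1(n4077).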

cDNA isolation

We used TITAN ONE-TUBE RT-PCR (Roche Diagnostics, Pleasanton,
California) to carry out RT-PCR and recovered trr-1 and hat-1 cDNA clones.
Existing cDNAs were obtained from the C. elegans EST project to determine
gene structures of epc-1, the trr-1 3' end and the ssl-1 5' end. We used 5'
RACE (5' RACE System v2.0, GIBCO) to determine the 5' ends and SL1
trans-spliced leader sequences of trr-1, hat-1, and epc-1 transcripts.

Allele sequence 

We used PCR-amplified regions of genomic DNA as templates in
determining mutant allele sequences. For each allele investigated, we
determined the sequences of all exons and splice junctions of the gene in
question. All mutations were confirmed by determining the sequence of
independently-derived PCR products. All sequences were determined using an
automated ABI 373 DNA sequencer (Applied Biosystems).


Example III 

ssl-1, a p400 SWI/SNF ATPase homolog, acts redundantly with lin-15B 

TRRAP is a component of the mammalian p400 complex, which
contains the p400 SWI/SNF family protein and was identified based on its
interaction with the adenovirus E1A oncoprotein (Fuchs et al., Cell 106:
297-307, 2001). Although Tip60 was not present in the purified p400 complex,
the Tip60 and p400 complexes share many of the same components, and more
recent analyses have indicated that p400 and Tip60 can copurify as part of a
large p400/Tip60 multisubunit complex (Frank et al., EMBO Rep 4:575-80,
2003).

As discussed in Example II, the ssl-1 (ssl, SWI/SNF-like) gene encodes
a homolog of the p400 protein. RNAi of ssl-1 using standard methods caused
fully penetrant embryonic lethality like that observed with epc-1(RNAi).
Zygotic RNAi of ssl-1, performed as described above, did not cause defects in
vulval development in either class A or class B synMuv backgrounds. In
further studies, we isolated a deletion mutation, n4077, that removes a portion
of the fifth ssl-1 exon. ssl-1(n4077) is predicted to encode a truncated protein
containing the first 540 amino acids of the 1671 amino acid SSL-1 protein and
two unrelated amino acids. ssl-1(n4077) homozygotes were partially sterile
and produced a few inviable embryos, but were not defective in vulval
development. ssl-1(n4077); lin-15A(n767) mutants were likewise not defective
in vulval development; however, ssl-1(n4077); lin-15B(n744) mutants often
expressed an ectopic vulval cell fate in P8.p. ssl-1(n4077) likely causes a
stronger reduction in gene activity than does ssl-1 zygotic RNAi, and this
stronger reduction unmasks a redundancy between ssl-1 and lin-15B.

trr-1; hat-1, trr-1; epc-1 and trr-1; ssl-1 double mutants do not show 
synthetic defects in vulval development 

Whereas synthetic defects in double mutants imply genetic redundancy,
the lack of synthetic defects in double mutants can indicate that two genes act
in the same genetic pathway. Based on the similar phenotypes and genetic
interactions of trr-1, hat-1 and epc-1 mutants and on the copurification of the
proteins encoded by their mammalian and yeast counterparts, we hypothesized
that trr-1, hat-1 and epc-1 act together to regulate vulval development. To test
this possibility, we constructed double mutants to determine if hat-1 and epc-1
function redundantly with trr-1. We measured the numbers of vulval cell fates
in trr-1(n3712); hat-1(n3681), trr-1(n3712); hat-1(n4075), and trr-1(n3712);
epc-1(RNAi) mutants and found that the extent of vulval development observed
in these double mutants was similar to that observed in single mutant animals.
These results suggest that hat-1 and epc-1 act in the same genetic pathway as
trr-1, which, by analogy to the class A and class B lin-35 Rb synMuv pathways,
we have named the class C synMuv pathway.

trr-1; ssl-1 double mutants and, as described above, ssl-1; lin-15A
mutants were not synthetically defective in P(3-8).p cell-fate specification. It is
possible that ssl-1 has both class C and class A synMuv activities; however,
additional considerations suggest that ssl-1 has properties more like those of a
class C gene. For instance, ssl-1; synMuvB mutants have a defect limited to
P8.p, whereas synMuvA; synMuvB mutants typically show ectopic vulval cell
fates in P3.p, P4.p and P8.p. In addition, ssl-1 mutants are sterile, and sterility
has not been observed for any class A synMuv gene (Thomas et al.,
Development 126: 3449-59, 1999). These considerations, along with the
copurification of the mammalian SSL-1 and HAT-1 counterparts, p400 and
Tip60, suggest that ssl-1 is an atypical class C gene, one that acts redundantly
with class B, but not class A, synMuv genes.


trr-1, hat-1, epc-1 and ssl-1 act redundantly with the lin-35 Rb pathway to 
antagonize let-60 Ras signaling 

Identifying genes involved in cell-fate determination is important for
understanding how cells that contain the same genomic information can adopt
different fates during animal development. As they help to distinguish P3.p,
P4.p and P8.p from P(5-7).p, trr-1, hat-1, epc-1 and ssl-1 are such cell-fate
determination genes.

In many cases, pathways that control cell-fate determination and cell
division in invertebrates have been shown to regulate similar processes in
mammals. Pathways that regulate vulval cell-fate specification in C. elegans
provide clear examples. A conserved let-60 Ras pathway induces vulval cell
fates, and this pathway is antagonized by an at least partially conserved class B
lin-35 Rb pathway. trr-1, hat-1, epc-1 and ssl-1 act in parallel to lin-35 Rb and
other genes in this pathway to negatively regulate let-60 Ras signaling. We
suggest that the mammalian counterparts of trr-1, hat-1, epc-1 and ssl-1 may
similarly act in parallel to Rb and antagonize Ras in the control of cell-fate
determination and cell division. It is interesting to note that the p400 complex
and Rb-containing complexes are targeted by the adenovirus E1A oncoprotein
(Whyte et al., Nature 334:124-9, 1988; Fuchs et al., Cell 106: 297-307, 2001).
Our finding regarding ssl-1 redundancy with a lin-35 Rb pathway gene
suggests that E1A may act in mammals by perturbing the activities of
functionally redundant p400 and Rb-containing complexes.

Identification of new class B synMuv genes 

On the basis of genetic interactions, the synMuv genes have been
grouped into three classes: A, B and C. For an animal to show vulval
abnormalities, genes representing two of the three classes must be dysfunctional.
The class B synMuv genes include genes that encode homologs of the
mammalian Rb tumor suppressor protein and other proteins that act with Rb in
regulating cell-fate specification and division in mammals. We have recently
discovered three new class B synMuv genes: lin(n3628), lin(n4256), and
lin-65. lin(n3628) encodes a protein similar to the yeast Set2 histone
methyltransferase. The nucleic acid and amino acid sequences of lin(n3628)
are shown in Figures 23 and 24, respectively. lin(n4256) encodes a protein
similar to yeast and mammalian SUV39H1 family histone methyltransferases.
The nucleic acid and amino acid sequences of lin(n4256) are provided in
Figures 25 and 26. lin-65 encodes a protein rich in acidic amino acids. The
nucleic acid and amino acid sequences of lin-65 are provided in Figures 27 and
28.
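
The two-of-three-classes rule stated above can be written as a short predicate.
The Python below is a simplified sketch: the function name and the class labels
as strings are assumptions made here for illustration, and the rule ignores
exceptions noted in this document, such as ssl-1, which is described as acting
redundantly with class B but not class A genes.

```python
SYNMUV_CLASSES = {"A", "B", "C"}


def predicts_vulval_abnormality(disrupted_classes):
    """True if genes from at least two distinct synMuv classes are disrupted."""
    classes = set(disrupted_classes)
    unknown = classes - SYNMUV_CLASSES
    if unknown:
        raise ValueError(f"unknown synMuv class: {sorted(unknown)}")
    return len(classes) >= 2


single_class = predicts_vulval_abnormality(["A"])        # one class only
two_classes = predicts_vulval_abnormality(["A", "B"])    # classes A and B
same_class_twice = predicts_vulval_abnormality(["B", "B"])
```

Two mutations in the same class count as a single disrupted class under this
rule, so they are not predicted to cause a vulval abnormality.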
The striking parallel between the Rb pathway in mammals and the
Rb-related pathway we have identified in worms suggests that further
characterization of the synthetic Multivulva genes will provide insights into
how cell proliferation is regulated in humans. Because synMuv genes encode
members of a conserved tumor suppressor pathway that antagonizes a
conserved Ras oncogene pathway, the class B synMuv genes are likely to be
important in understanding cancer progression in mammals. Provided with the
human genome sequence, standard methods can be used to identify mammalian
orthologs of newly-identified synMuv genes. Such homologs may act as tumor
suppressors or oncogenes in mammals. Genetic enhancer or suppressor screens
may be performed to identify new genes that may function in or interface
with this Rb-related pathway. Furthermore, using methods described herein,
drug screens can be used to identify compounds that affect cell proliferation.
Compounds that block the Muv phenotype of synMuv mutant animals are
likely to be useful antitumor agents for the treatment of a mammalian
neoplasia.

Compounds that stimulate cell division in animals with a single, silent 
synMuv mutation are likely to be agonists of cell proliferation and may act in a 
manner analogous to growth factors. Such compounds are useful in the
treatment of a subject in need of increased cell proliferation, for example, in a
subject that has a disorder characterized by increased cell death, such as 
Alzheimer's disease, Huntington's disease, stroke, Parkinson's disease, 
myocardial infarction or congestive heart failure. 

Identifying synMuv targets

The targets of synMuv biological activity, for example, genes that are
transcriptionally regulated by a synMuv nucleic acid or polypeptide, are
identified using a variety of genetic and molecular approaches. While target
identification is discussed below for the class B synMuvs, similar approaches
are used to identify the targets of the class C synMuvs or other transcriptional
regulatory systems.

At least two genetic screens can be used to identify class B synMuv
targets. Both screens are based on the premise that the class B synMuv
proteins negatively regulate transcription. If so, one would postulate that the
Muv phenotype of synMuv mutants is due to the ectopic expression of class B
targets. Loss-of-function mutations in such targets likely suppress the synMuv
phenotype. In one example, a simple F2 suppression screen is used to identify
such targets. In fact, such screens have identified class B suppressor mutations
that may affect such genes. Many of the isolates from these screens are as yet
uncharacterized.

In a second example, which would likely identify genes whose
expression is negatively regulated by the class B synMuvs, mutagenized class
A synMuv F1 animals are screened for a Muv phenotype. Dominant mutations
expected from this screen might affect regulatory sequences bound by synMuv
proteins and lead to ectopic expression of the target gene in question.
Mutations of this type have been shown to affect the expression of egl-1, a
gene that promotes programmed cell death in C. elegans. These egl-1(gf)
mutations disrupt a binding site for the TRA-1 transcriptional repressor protein,
leading to ectopic egl-1 expression in the hermaphrodite specific neurons and
subsequent programmed cell death (Conradt et al., Cell 98:317-27, 1999).

Because transcription factors typically target multiple genes, loss of
function of one target may not suppress the phenotype caused by a
transcriptional repressor loss of function or, alternatively, recapitulate the
phenotype caused by transcriptional activator loss of function. Such challenges
are overcome by performing screens in a particularly sensitized genetic
background so as to allow the observation of a small effect that may be caused
by loss of one target. For example, in one of the screens described above, the
Muv phenotype caused by a temperature-sensitive lin-15AB allele was
suppressed. A similarly sensitized background may be used to carry out F2
suppression and F1 synMuv screens.

Various molecular approaches involving microarrays are also useful in 
identifying synMuv targets. In the simplest experiment, expression profiles of
synMuv mutants are compared to the wild type. A comparison of a synMuv
double mutant to the wild type can be problematic because these animals have
different amounts of vulval tissue. The generation of vulval tissue likely
involves the differential regulation of many genes, only a subset of which
might be direct targets of synMuvs. Alternatively, a synMuv single mutant can
be compared to a wild-type control. This approach may not succeed if two
classes of synMuvs must lose function in order for transcription to be
differentially regulated. If mutations in two classes of synMuvs are desired, an
appropriate comparison may, for example, be that of a synMuvA; synMuvB;
let-60 Ras triple mutant versus a let-60 Ras single mutant. These animals
would fulfill the requirements of having the same amount of vulval tissue and
disabling two classes of synMuvs. Alternatively, chromatin
immunoprecipitation (ChIP) combined with microarray analysis may be used.
For example, in a preparation of proteins crosslinked to DNA, DPL-1 or EFL-1
could be immunoprecipitated, the crosslink reversed, and the resultant DNA
amplified and applied to microarrays. The microarray experiments described
above may identify synMuv targets that could be compared to putative let-60
Ras pathway targets as previously determined by microarray analyses
(Romagnolo et al., Dev Biol 247:127-36, 2002). Determining this interface is
clearly an important issue, as Rb and Ras pathways antagonize each other not
only in C. elegans, but also during cell cycle progression in cultured
mammalian cells (Mittnacht et al., Curr Biol. 7:219-21, 1997; Peeper et al.,
Nature 386:177-81, 1997).
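
The expression-profile comparison described above can be sketched in a few lines. This is a minimal illustration only, assuming background-corrected intensities are already available per gene; the function names, the pseudocount, the two-fold threshold, and the gene names and values are hypothetical and are not taken from this specification.

```python
import math

def log2_ratios(mutant, control, pseudocount=1.0):
    """Return {gene: log2((mutant + p) / (control + p))} for genes in both profiles."""
    return {
        gene: math.log2((mutant[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in mutant
        if gene in control
    }

def candidate_targets(mutant, control, min_log2=1.0):
    """Genes induced at least 2**min_log2-fold in the mutant, most induced first."""
    ratios = log2_ratios(mutant, control)
    hits = [(gene, r) for gene, r in ratios.items() if r >= min_log2]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical intensities (arbitrary units); gene names are placeholders.
# "triple_mutant" stands for a synMuvA; synMuvB; let-60 Ras animal and
# "let60_single" for the let-60 Ras single-mutant reference.
triple_mutant = {"geneA": 799.0, "geneB": 101.0, "geneC": 49.0}
let60_single = {"geneA": 99.0, "geneB": 99.0, "geneC": 199.0}

hits = candidate_targets(triple_mutant, let60_single)
```

Because class B synMuv proteins are proposed to repress transcription, the sketch reports only genes whose expression rises in the mutant; a gene that falls (such as the placeholder geneC) is ignored.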

Do the synMuv genes act by regulating cell cycle progression?

Many studies of Rb and E2F in mammals have focused on the roles of
these proteins in cell cycle regulation. Might the class B synMuv genes, and
possibly other classes of synMuv genes, regulate vulval development through
direct regulation of P(3-8).p cell cycles? While not being tied to a particular
theory, the following observations support this possibility. For example, P3.p,
P4.p, and P8.p undergo extra cell divisions in synMuv mutants. Additionally,
mutations in a subset of class B synMuv genes that includes dpl-1, efl-1, and
lin-35 Rb have been shown to partially suppress the S phase and cell division
defects caused by RNA-mediated interference of the C. elegans cyclin D
homolog cyd-1 (Boxem et al., Curr Biol. 12:906-11, 2002). There are other
aspects of these observations that complicate a strict cell cycle regulation
model. First, not only are there extra P3.p, P4.p, and P8.p cell divisions in
synMuv mutants, but there are also various changes in the differentiation of
P3.p, P4.p, and P8.p descendants in synMuv mutants. The synMuv genes
therefore appear to regulate a cell fate decision, a component of which is the
decision to progress through the cell cycle. Studies of Rb in mammals have
indicated that Rb may have a role in halting cell cycle progression and
stimulating differentiation during myogenesis (reviewed by Kitzmann, Cell Mol
Life Sci. 58:571-9, 2001). Second, whereas dpl-1, efl-1, and lin-35 Rb
mutations can partially suppress defects caused by cyd-1(RNAi), mutations in
other class B synMuv genes cannot (Boxem et al., Curr Biol. 12:906-11, 2002).
This observation suggests that, if the class B synMuv genes are cell cycle
regulators, some of them act in a tissue-specific fashion, for example in
P(3-8).p but not in the intestinal cells that were monitored in cyd-1(RNAi) studies.
Monitoring cell cycle progression in P3.p, P4.p, and P8.p will address these
issues.

The identification of synMuv transcriptional targets will enable us to 
identify their mammalian orthologs. Such targets are promising clinical targets
for chemotherapeutics for the treatment of neoplasia. In addition, the
identification of synMuv protein-protein interactions is useful in screening for
chemotherapeutic drugs that modulate such interactions.

Identification of Additional Mammalian Orthologs 

Because the Rb and RAS pathways are conserved between mammals
and C. elegans, the powerful genetics and genomics of C. elegans can be
exploited, as described herein, for the systematic identification of mammalian
genes that correspond to C. elegans genes identified according to methods
described herein. Such genes include mammalian orthologs of synMuv class
B and class C genes and their transcriptional targets.

Protein sequences corresponding to genes of interest are retrieved from
the repositories of C. elegans sequence information at the WormBase web site.
The C. elegans protein or nucleic acid sequence is then used for standard
BLASTP or tblastn searching using the NCBI website. The protein
sequence corresponding to the top mammalian candidate produced by tblastn is
retrieved from GenBank and is used for a BLASTP search of C. elegans proteins
using the WormBase website. These methods allow us to identify mammalian
orthologs of worm genes revealed by our genetic analysis.
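
The two-step search described above amounts to a reciprocal best-hit test. The following sketch shows only that logic, assuming the search results have already been parsed into lists of (subject, bit score) pairs; it does not query NCBI or WormBase, and the identifiers and scores are hypothetical placeholders.

```python
def best_hit(hits):
    """Return the subject ID with the highest bit score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda h: h[1])[0]

def reciprocal_best_hit(worm_id, forward_hits, reverse_hits):
    """
    forward_hits: {worm protein: [(mammalian protein, bit score), ...]}
    reverse_hits: {mammalian protein: [(worm protein, bit score), ...]}
    Return the top mammalian candidate only if its own best worm hit is worm_id.
    """
    candidate = best_hit(forward_hits.get(worm_id, []))
    if candidate is None:
        return None
    back = best_hit(reverse_hits.get(candidate, []))
    return candidate if back == worm_id else None

# Hypothetical parsed results; IDs and scores are placeholders, not real data.
forward = {
    "worm_P1": [("mammal_M1", 310.0), ("mammal_M2", 120.0)],
    "worm_P2": [("mammal_M3", 95.0)],
}
reverse = {
    "mammal_M1": [("worm_P1", 298.0), ("worm_P9", 80.0)],
    "mammal_M3": [("worm_P7", 140.0), ("worm_P2", 90.0)],
}

ortholog_p1 = reciprocal_best_hit("worm_P1", forward, reverse)
ortholog_p2 = reciprocal_best_hit("worm_P2", forward, reverse)
```

In this example the first worm protein returns its candidate because the reverse search points back to it, while the second does not, since its top mammalian hit matches a different worm protein more strongly.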

An ortholog is a protein that is functionally related to a reference 
sequence. Such orthologs might be expected to functionally substitute for one
another. For example, a mammalian ortholog of a C. elegans
gene, when expressed in a worm having a mutation in the C. elegans gene,
might be expected to partially or completely rescue the worm phenotype.

RNAi in mammalian cell lines

RNAi has been used extensively to deplete mRNAs in mammalian cell
culture (Elbashir et al., Nature 411:494-8, 2001). Mammalian orthologs of
class C synMuv genes can be identified using RNAi, for example, in
mammalian cultured cells. Briefly, an inhibitory nucleic acid is introduced into
a mammalian cell having a mutation in a class A or class B synMuv gene, for
example, by lipofection. Such cells are then assayed for increased levels of cell
proliferation relative to control cells not contacted with an inhibitory nucleic
acid. An increased level of proliferation in mammalian cells contacted with the
inhibitory nucleic acid identifies the corresponding target gene as a class C
synMuv gene.

Microarrays 

The class B and class C genes described herein are useful in identifying
their transcriptional regulatory targets. Such targets may be identified using
microarrays in combination with chromatin immunoprecipitation (ChIP) as
described herein. Such methods are described in U.S. Patent Nos. 6,503,717,
6,410,243, and 6,610,489, hereby incorporated by reference. A nucleic acid
target of a class B or class C synMuv polypeptide will likely have a
mammalian ortholog. Such an ortholog represents a promising target for the
development of novel chemotherapeutics for the treatment of a neoplasia.

The array elements, which are preferably derived from the C. elegans
genome, are organized in an ordered fashion such that each element is present
at a specified location on the substrate. Useful substrate materials include
membranes composed of paper, nylon, or other materials, filters, chips, glass
slides, and other solid supports. The ordered arrangement of the array elements
allows hybridization patterns and intensities to be interpreted as expression
levels of particular genes or proteins. Methods for making nucleic acid
microarrays are known to the skilled artisan and are described, for example, in
U.S. Patent No. 5,837,832, Lockhart et al. (Nat. Biotech. 14:1675-1680, 1996),
and Schena et al. (Proc. Natl. Acad. Sci. 93:10614-10619, 1996), herein
incorporated by reference. Methods for making polypeptide microarrays are
described, for example, by Ge (Nucleic Acids Res. 28:e3.i-e3.vii, 2000),
MacBeath et al. (Science 289:1760-1763, 2000), Zhu et al. (Nature Genet.
26:283-289), and in U.S. Patent No. 6,436,665, hereby incorporated by
reference.

Nucleic acid microarrays

To produce a nucleic acid microarray, oligonucleotides may be
synthesized or bound to the surface of a substrate using a chemical coupling
procedure and an ink jet application apparatus, as described in PCT application
WO95/251116 (Baldeschweiler et al.), incorporated herein by reference.
Alternatively, a gridded array may be used to arrange and link cDNA
fragments or oligonucleotides to the surface of a substrate using a vacuum
system, thermal, UV, mechanical, or chemical bonding procedure.

A nucleic acid molecule (e.g., RNA or DNA) derived from a biological
sample, such as a cultured cell, a tissue specimen, or other source, may be used
to produce a hybridization probe as described herein. The mRNA is isolated
according to standard methods, and cDNA is produced and used as a template
to make complementary RNA suitable for hybridization using standard
methods. The RNA is amplified in the presence of fluorescent nucleotides, and
the labeled probes are then incubated with the microarray to allow the probe
sequence to hybridize to complementary oligonucleotides bound to the
microarray.

Incubation conditions are adjusted such that hybridization occurs with
precise complementary matches or with various degrees of less
complementarity depending on the degree of stringency employed. For
example, stringent salt concentration will ordinarily be less than about 750 mM
NaCl and 75 mM trisodium citrate, preferably less than about 500 mM NaCl
and 50 mM trisodium citrate, and most preferably less than about 250 mM
NaCl and 25 mM trisodium citrate. Low stringency hybridization can be
obtained in the absence of organic solvent, e.g., formamide, while high
stringency hybridization can be obtained in the presence of at least about 35%
formamide, and most preferably at least about 50% formamide. Stringent
temperature conditions will ordinarily include temperatures of at least about
30°C, more preferably of at least about 37°C, and most preferably of at least
about 42°C. Varying additional parameters, such as hybridization time, the
concentration of detergent, e.g., sodium dodecyl sulfate (SDS), and the
inclusion or exclusion of carrier DNA, are well known to those skilled in the
art. Various levels of stringency are accomplished by combining these various
conditions as needed. In a preferred embodiment, hybridization will occur at
30°C in 750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more
preferred embodiment, hybridization will occur at 37°C in 500 mM NaCl, 50
mM trisodium citrate, 1% SDS, 35% formamide, and 100 µg/ml denatured
salmon sperm DNA (ssDNA). In a most preferred embodiment, hybridization
will occur at 42°C in 250 mM NaCl, 25 mM trisodium citrate, 1% SDS, 50%
formamide, and 200 µg/ml ssDNA. Useful variations on these conditions will
be readily apparent to those skilled in the art.
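
The NaCl and trisodium citrate concentrations given above are always in a 10:1 ratio, which matches saline-sodium citrate (SSC) buffer, where 1x SSC is conventionally 150 mM NaCl and 15 mM trisodium citrate. The specification itself does not use SSC nomenclature, so the conversion below is offered only as a convenience sketch under that assumption; the function and constant names are hypothetical.

```python
# Assumed convention: 1x SSC = 150 mM NaCl + 15 mM trisodium citrate.
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0

def ssc_multiple(nacl_mm, citrate_mm, tol=1e-9):
    """Express a NaCl/citrate pair as a multiple of 1x SSC.

    Raises ValueError if the two salts are not in the 10:1 SSC ratio.
    """
    nacl_x = nacl_mm / SSC_1X_NACL_MM
    citrate_x = citrate_mm / SSC_1X_CITRATE_MM
    if abs(nacl_x - citrate_x) > tol:
        raise ValueError("salts are not in the 10:1 SSC ratio")
    return nacl_x

# Hybridization embodiments from the text: (temperature in °C, NaCl mM, citrate mM).
hybridization = {
    "preferred": (30, 750.0, 75.0),
    "more preferred": (37, 500.0, 50.0),
    "most preferred": (42, 250.0, 25.0),
}

ssc_by_embodiment = {
    name: round(ssc_multiple(nacl, citrate), 2)
    for name, (_temp, nacl, citrate) in hybridization.items()
}
```

Read this way, the three hybridization embodiments correspond to roughly 5x, 3.3x, and 1.7x SSC, with stringency rising as the salt multiple falls and the temperature and formamide content increase.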

The removal of nonhybridized probes may be accomplished, for 
example, by washing. The washing steps that follow hybridization can also 
vary in stringency. Wash stringency conditions can be defined by salt
concentration and by temperature. As above, wash stringency can be increased
by decreasing salt concentration or by increasing temperature. For example,
stringent salt concentration for the wash steps will preferably be less than about
30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about
15 mM NaCl and 1.5 mM trisodium citrate. Stringent temperature conditions
for the wash steps will ordinarily include a temperature of at least about 25°C,
more preferably of at least about 42°C, and most preferably of at least about
68°C. In a preferred embodiment, wash steps will occur at 25°C in 30 mM
NaCl, 3 mM trisodium citrate, and 0.1% SDS. In a more preferred
embodiment, wash steps will occur at 42°C in 15 mM NaCl, 1.5 mM trisodium
citrate, and 0.1% SDS. In a most preferred embodiment, wash steps will occur
at 68°C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Additional
variations on these conditions will be readily apparent to those skilled in the
art.

A detection system may be used to measure the absence, presence, and
amount of hybridization for all of the distinct sequences simultaneously (e.g.,
Heller et al., Proc. Natl. Acad. Sci. 94:2150-2155, 1997). Preferably, a scanner
is used to determine the levels and patterns of fluorescence.

Protein Microarrays

Families of proteins, such as those encoded by the genes described
herein, or their orthologs, may be analyzed using protein microarrays. Such
arrays are useful in high-throughput low-cost screens to identify peptide or
candidate compounds that bind a polypeptide of the invention, or fragment
thereof. Typically, protein microarrays feature a protein, or fragment thereof,
bound to a solid support. Suitable solid supports include membranes (e.g.,
membranes composed of nitrocellulose, paper, or other material), polymer-based
films (e.g., polystyrene), beads, or glass slides. For some applications,
proteins (e.g., polypeptides encoded by a class B or class C synMuv gene or
antibodies against such polypeptides) are spotted on a substrate using any
convenient method known to the skilled artisan (e.g., by hand or by inkjet
printer). Preferably, such methods retain the biological activity or function of
the protein bound to the substrate.

The protein microarray is hybridized with a detectable probe. Such
probes can be polypeptides, nucleic acids, or small molecules. For some
applications, polypeptide and nucleic acid probes are derived from a biological
sample taken from a patient, such as a homogenized tissue sample (e.g., a
tissue sample obtained by biopsy) or cultured cells (e.g., lymphocytes).
Probes can also include antibodies, candidate peptides, nucleic acids, or small
molecule compounds derived from a peptide, nucleic acid, or chemical library.
Hybridization conditions (e.g., temperature, pH, protein concentration, and
ionic strength) are optimized to promote specific interactions. Such conditions
are known to the skilled artisan and are described, for example, in Harlow, E.
and Lane, D., Using Antibodies: A Laboratory Manual. 1998, New York: Cold
Spring Harbor Laboratories. After removal of non-specific probes, specifically
bound probes are detected, for example, by fluorescence, enzyme activity (e.g.,
an enzyme-linked colorimetric assay), direct immunoassay, radiometric assay,
or any other suitable detectable method known to the skilled artisan.

Screening Assays 

As discussed above, C. elegans class B and class C synMuv genes and
their encoded proteins function in chromatin remodeling and antagonize the
RAS pathway. Given that mechanisms for controlling mammalian cell cycle
regulation and C. elegans vulval development are highly conserved, C. elegans
and components of the C. elegans synMuv pathway are useful in screening
methods for chemotherapeutics and for the identification of novel clinical
targets.

Compounds that modulate the function of a class B or class C synMuv
nucleic acid or of their encoded proteins are likely to be useful in treating
neoplasias. Based on this discovery, screening assays may be carried out to
identify compounds that modulate the action of a polypeptide or the expression
of a nucleic acid sequence of the invention. Such compounds are useful in
treating a neoplasia. The method of screening may involve high-throughput
techniques. In addition, these screening techniques may be carried out in
cultured mammalian cells or in animals (e.g., nematodes).

Any number of methods are available for carrying out such screening
assays. In one working example, candidate compounds are added at varying
concentrations to the culture medium of cultured cells expressing one of the
nucleic acid sequences described herein. Gene expression is then measured,
for example, by standard Northern blot analysis (Ausubel et al., supra) or
RT-PCR, using any appropriate fragment prepared from the nucleic acid molecule
as a hybridization probe. The level of gene expression in the presence of the
candidate compound is compared to the level measured in a control culture
medium lacking the candidate molecule. A compound that promotes a
decrease in the expression of a nucleic acid sequence disclosed herein or a
functional equivalent is considered useful in the invention; such a molecule
may be used, for example, as a therapeutic to delay or ameliorate human
diseases associated with neoplasia or inappropriate cell cycle regulation. Such
cultured cells include nematode cells (for example, C. elegans cells),
mammalian cells, or insect cells.
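
The comparison step in this working example can be sketched as follows. This is an illustration only: it assumes expression has already been quantified (for example, from Northern blot band intensity or RT-PCR) on a common scale, and the 50% cutoff, the function names, the compound labels, and all values are hypothetical rather than taken from this specification.

```python
def percent_change(treated, control):
    """Percent change of treated expression relative to the untreated control."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return 100.0 * (treated - control) / control

def flag_compounds(measurements, control_level, min_decrease_pct=50.0):
    """
    measurements: {compound: {concentration: expression level}}
    Return {compound: {concentration: percent change}} for compounds that lower
    expression by at least min_decrease_pct at any tested concentration.
    """
    flagged = {}
    for compound, by_conc in measurements.items():
        changes = {
            conc: percent_change(level, control_level)
            for conc, level in by_conc.items()
        }
        if min(changes.values()) <= -min_decrease_pct:
            flagged[compound] = changes
    return flagged

# Hypothetical data: expression (arbitrary units) at two concentrations (µM).
control_level = 200.0
measurements = {
    "cmpd_1": {1.0: 180.0, 10.0: 80.0},
    "cmpd_2": {1.0: 210.0, 10.0: 190.0},
}

flagged = flag_compounds(measurements, control_level)
```

Keeping the per-concentration changes for each flagged compound makes it easy to check afterwards whether the decrease tracks the dose.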

In another working example, the effect of candidate compounds may be
measured at the level of polypeptide production using the same general
approach and standard immunological techniques, such as Western blotting or
immunoprecipitation with an antibody specific for a polypeptide of the
invention. For example, immunoassays may be used to detect or monitor the
expression of at least one of the polypeptides of the invention in an organism.
Polyclonal or monoclonal antibodies (produced by standard techniques) that
are capable of binding to such a polypeptide may be used in any standard
immunoassay format (e.g., ELISA, Western blot, or RIA assay) to measure the
level of the polypeptide. A compound that promotes a decrease in the
expression of the polypeptide is considered particularly useful. Again, such a
molecule may be used, for example, as a therapeutic to ameliorate neoplasia.

In one example, candidate compounds are screened for those that 
specifically bind to and antagonize a synMuv B or synMuv C polypeptide. 
Such an interaction can be readily assayed using any number of standard
binding techniques and functional assays (e.g., those described in Ausubel et
al., supra). For example, a candidate compound may be tested in vitro for
interaction and binding with a polypeptide of the invention, and its ability to
modulate the cell cycle or decrease cell proliferation may be assayed by any
standard technique (e.g., a C. elegans synMuv assay).

In one particular working example, a candidate compound that binds to
a polypeptide may be identified using a chromatography-based technique. For
example, a recombinant polypeptide of the invention may be purified by
standard techniques from cells engineered to express the polypeptide (e.g.,
those described above) and may be immobilized on a column. A solution of
candidate compounds is then passed through the column, and a compound
specific for the polypeptide is identified on the basis of its ability to bind to the
polypeptide and be immobilized on the column. To isolate the compound, the
column is washed to remove non-specifically bound molecules, and the
compound of interest is then released from the column and collected.
Compounds isolated by this method (or any other appropriate method) may, if
desired, be further purified (e.g., by high performance liquid chromatography).
In addition, these candidate compounds may be tested for their ability to cause
cell death using any assay known to the skilled artisan. Compounds isolated by
this approach may also be used, for example, as therapeutics to delay or
ameliorate human diseases associated with neoplasia. Compounds that are
identified as binding to polypeptides of the invention with an affinity constant
less than or equal to 10 mM are considered particularly useful in the invention.

Potential antagonists include organic molecules, peptides, peptide
mimetics, polypeptides, nucleic acids, and antibodies that bind to a nucleic acid
sequence or polypeptide of the invention and thereby increase or decrease its
activity. Potential antagonists also include small molecules that bind to and
occupy the binding site of the polypeptide thereby preventing binding to
cellular binding molecules, such that normal biological activity is prevented.
Each of the DNA sequences provided herein may also be used in the
discovery and development of therapeutic lead compounds. The encoded
protein, upon expression, can be used as a target for the screening of
therapeutics for the treatment of neoplasia. Additionally, the DNA sequences
encoding the amino terminal regions of the encoded protein or Shine-Dalgarno
or other translation facilitating sequences of the respective mRNA can be used
to construct antisense, dsRNA, or siRNA sequences to control the expression
of the coding sequence of interest. Such sequences may be isolated by standard
techniques (Ausubel et al., supra). The antagonists of the invention may be
employed, for instance, to delay or ameliorate human diseases associated with
neoplasia.

Optionally, compounds identified in any of the above-described assays
may be confirmed as useful in delaying or ameliorating human diseases
associated with neoplasia or inappropriate cell cycle regulation in either
standard tissue culture methods or animal models and, if successful, may be
used as therapeutics for the treatment of neoplasia.

Small molecules of the invention preferably have a molecular weight 
below 2,000 daltons, more preferably between 300 and 1,000 daltons, and most 
preferably between 400 and 700 daltons. It is preferred that these small 
molecules are organic molecules. 

Test Compounds and Extracts 

In general, compounds capable of delaying or ameliorating human
diseases associated with neoplasia are identified from large libraries of both
natural product and synthetic (or semi-synthetic) extracts or chemical libraries
according to methods known in the art. Those skilled in the field of drug
discovery and development will understand that the precise source of test
extracts or compounds is not critical to the screening procedure(s) of the
invention. Compounds used in screens may include known compounds (for
example, known therapeutics used for other diseases or disorders).

Alternatively, virtually any number of unknown chemical extracts or
compounds can be screened using the methods described herein. Examples of
such extracts or compounds include, but are not limited to, plant-, fungal-,
prokaryotic- or animal-based extracts, fermentation broths, and synthetic
compounds, as well as modification of existing compounds. Numerous
methods are also available for generating random or directed synthesis (e.g.,
semi-synthesis or total synthesis) of any number of chemical compounds,
including, but not limited to, saccharide-, lipid-, peptide-, and nucleic acid-based
compounds. Synthetic compound libraries are commercially available
from Brandon Associates (Merrimack, NH) and Aldrich Chemical (Milwaukee,
WI). Alternatively, libraries of natural compounds in the form of bacterial,
fungal, plant, and animal extracts are commercially available from a number of
sources, including Biotics (Sussex, UK), Xenova (Slough, UK), Harbor Branch
Oceanographics Institute (Ft. Pierce, FL), and PharmaMar, U.S.A. (Cambridge,
MA). In addition, natural and synthetically produced libraries are produced, if
desired, according to methods known in the art, e.g., by standard extraction and
fractionation methods. Furthermore, if desired, any library or compound is
readily modified using standard chemical, physical, or biochemical methods.

In addition, those skilled in the art of drug discovery and development
readily understand that methods for dereplication (e.g., taxonomic
dereplication, biological dereplication, and chemical dereplication, or any
combination thereof) or the elimination of replicates or repeats of materials
already known to function in neoplasia should be employed whenever possible.

When a crude extract is found to decrease cell proliferation or to
suppress a synMuv phenotype, further fractionation of the positive lead extract
is necessary to isolate chemical constituents responsible for the observed effect.
Thus, the goal of the extraction, fractionation, and purification process is the
careful characterization and identification of a chemical entity within the crude
extract that inhibits cell proliferation or suppresses a synMuv phenotype.
Methods of fractionation and purification of such heterogeneous extracts are
known in the art. If desired, compounds shown to be useful agents to delay or
ameliorate human diseases associated with neoplasia are chemically modified
according to methods known in the art.

Pharmaceutical Therapeutics 

The invention provides a simple means for identifying compositions
(including nucleic acids, peptides, small molecule inhibitors, and mimetics)
capable of acting as therapeutics for the treatment of a neoplastic disease.
Accordingly, a chemical entity discovered to have medicinal value using the
methods described herein is useful as a drug or as information for structural
modification of existing compounds, e.g., by rational drug design. Such
methods are useful for screening compounds having an effect on a variety of
diseases characterized by inappropriate cell cycle regulation.

For therapeutic uses, the compositions or agents identified using the
methods disclosed herein may be administered systemically, for example,
formulated in a pharmaceutically-acceptable buffer such as physiological
saline. Preferable routes of administration include, for example, subcutaneous,
intravenous, intraperitoneal, intramuscular, or intradermal injections that
provide continuous, sustained levels of the drug in the patient. Treatment of
human patients or other animals will be carried out using a therapeutically
effective amount of a neoplastic disease therapeutic in a physiologically-acceptable
carrier. Suitable carriers and their formulation are described, for
example, in Remington's Pharmaceutical Sciences by E.W. Martin. The
amount of the therapeutic agent to be administered varies depending upon the
manner of administration, the age and body weight of the patient, and the
clinical symptoms of the neoplastic disease. Generally, amounts will be in the
range of those used for other agents used in the treatment of a neoplastic
disease, although in certain instances lower amounts will be needed because of
the increased specificity of the compound. A compound is administered at a
dosage that controls the clinical or physiological symptoms of a neoplastic
disease as determined by, for example, measuring tumor size, cell proliferation,
or metastasis.

Formulation of Pharmaceutical Compositions 

Administration of a compound may be by any suitable means that is
effective for the treatment of a neoplastic disease. Generally, compounds are
admixed with a suitable carrier substance, and are generally present in an
amount of 1-95% by weight of the total weight of the composition. The
composition may be provided in a dosage form that is suitable for oral,
parenteral (e.g., intravenous, intramuscular, subcutaneous), rectal, transdermal,
nasal, vaginal, inhalant, or ocular administration. Thus, the composition may
be in the form of, e.g., tablets, capsules, pills, powders, granulates, suspensions,
emulsions, solutions, gels including hydrogels, pastes, ointments, creams,
plasters, drenches, delivery devices, suppositories, enemas, injectables,
implants, sprays, or aerosols. The pharmaceutical compositions may be
formulated according to conventional pharmaceutical practice (see, e.g.,
Remington: The Science and Practice of Pharmacy, (20th ed.) ed. A.R.
Gennaro, 2000, Lippincott Williams & Wilkins, Philadelphia, PA, and
Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C.
Boylan, 1988-2002, Marcel Dekker, New York).

Other Embodiments 

From the foregoing description, it will be apparent that variations and 
modifications may be made to the invention described herein to adapt it to 
various usages and conditions. Such embodiments are also within the scope of
the following claims.

All publications mentioned in this specification are herein incorporated 
by reference to the same extent as if each independent publication was 
specifically and individually indicated to be incorporated by reference. 

What is claimed is:

Claims

1. A method for identifying a compound that treats a neoplasia, said
method comprising the steps of:

(a) contacting a cell comprising a mutation in a Class B synMuv gene
selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65
and a second mutation in a synthetic multivulval gene, or an ortholog thereof,
with a candidate compound;

(b) detecting a phenotypic alteration in said contacted cell relative to a
control cell; wherein a candidate compound that alters the phenotype of said
contacted cell relative to said control cell is a compound that treats a neoplasia.

2. The method of claim 1 , wherein said cell is in a nematode. 

3 . The method of claim 2, wherein said phenotypic alteration is an 
15 alteration in a multivulval phenotype. 

4. The method of claim 2, wherein said phenotypic alteration is an 
alteration in sterility. 

20 5 . The method of claim 1 , wherein said synthetic multivulval gene 

is a synMuv class A gene. 

6. The method of claim 1, wherein said cell is an isolated 
mammalian cell. 

25 

7. The method of claim 1, wherein said phenotypic alteration is a 
decrease in cell proliferation. 
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8. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell having a mutation in a Class B synMuv gene 
selected from the group consisting of mep-1, Un(n3628), Un(n4256), and lin-65 

5 and having a second mutation in a synMuv nucleic acid or ortholog thereof; 

(b) contacting said cell with a candidate compound; and 

(c) detecting a decrease in proliferation of said cell contacted with said 
candidate compound relative to a control cell not contacted with said candidate 
compound, wherein a decrease in proliferation identifies said candidate 

10 compound as a candidate compound that treats a neoplasia. 

9. The method of claim 8, wherein said cell is in a nematode. 

10. The method of claim 9, wherein said decrease in proliferation is
detected by detecting inhibition of a Muv phenotype.

11. The method of claim 8, wherein said cell has a mutation in Dp,
E2F, or histone deacetylase.

12. The method of claim 8, wherein said cell is an isolated
mammalian cell.
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13. A method of identifying a compound that treats a neoplasia, said 

method comprising: 

(a) providing a cell expressing a nucleic acid having at least 95%
identity to a Class B synMuv gene selected from the group consisting of:
mep-1, lin(n3628), lin(n4256), and lin-65;

(b) contacting said cell with a candidate compound; and 

(c) monitoring the expression of said nucleic acid, an alteration in the 
level of expression of said nucleic acid indicates that said candidate compound 
is a compound that treats a neoplasia. 

14. The method of claim 13, wherein said gene comprises a reporter
gene.

15. The method of claim 13, wherein said reporter gene comprises
lacZ, gfp, CAT, or luciferase.

16. The method of claim 13, wherein said expression is monitored by 
assaying protein level. 

17. The method of claim 13, wherein said expression is monitored by
assaying nucleic acid level.

18. The method of claim 13, wherein said cell is in a nematode.

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19. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a Class B synMuv gene selected from
the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65;

(b) contacting said cell with a candidate compound; and

(c) comparing the expression of said polypeptide in said cell contacted
with said candidate compound to a control cell not contacted with said
candidate compound, wherein an increase in the expression of said polypeptide
identifies said candidate compound as a candidate compound that treats a
neoplasia.

20. The method of claim 19, wherein said cell is in a nematode. 

21. The method of claim 19, wherein said expression is monitored
with an immunological assay.

22. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a Class B synMuv polypeptide selected
from the group consisting of: MEP-1, LIN(n3628), LIN(n4256), and LIN-65;

(b) contacting said cell with a candidate compound; and

(c) comparing the biological activity of said polypeptide in said cell
contacted with said candidate compound to a control cell not contacted with
said candidate compound, wherein an increase in the biological activity of said
polypeptide identifies said candidate compound as a candidate compound that
treats a neoplasia.

23. The method of claim 22, wherein said biological activity is 
monitored with an enzymatic assay.
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24. The method of claim 22, wherein said biological activity is 
monitored with an immunological assay. 

25. The method of claim 22, wherein said biological activity is
monitored with a nematode bioassay.

26. A method of identifying a nucleic acid target of class B synMuv 
biological activity, said method comprising: 

(a) mutagenizing a C. elegans comprising mutations in a Class B
synMuv gene selected from the group consisting of: mep-1, lin(n3628),
lin(n4256), and lin-65 and in a Class A synMuv gene;

(b) allowing said C. elegans to reproduce; and

(c) selecting a C. elegans comprising a mutation that suppresses a
synMuv phenotype; wherein said mutation identifies a nucleic acid target of
class B synMuv biological activity.

27. A method of identifying a nucleic acid target of class B synMuv 
biological activity, said method comprising: 

(a) providing a microarray comprising fragments of nematode nucleic
acids;

(b) contacting said microarray with detectably labeled nucleic acids
derived from a nematode comprising a mutation in a Class B synMuv gene
selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65
gene;

(c) detecting an alteration in the expression of at least one nucleic acid
of a C. elegans comprising a mutation in said Class B synMuv gene relative to
the expression of said nucleic acid in a control nematode, wherein an alteration
in said expression identifies said nucleic acid as a nucleic acid target of class B
synMuv biological activity.
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28. The method of claim 27, wherein said C. elegans further 
comprises a mutation in a second synMuv gene. 

29. The method of claim 27, wherein said C. elegans further
comprises a mutation in a gene that results in a Vulvaless (Vul) phenotype.

30. A method for identifying a nucleic acid that binds a synMuv class 
B polypeptide, said method comprising: 

(a) providing nucleic acids derived from a nematode cell;

(b) crosslinking said nucleic acids and their associated proteins to form a
nucleic acid-protein complex;

(c) contacting said nucleic acid-protein complex with an antibody
against a polypeptide selected from the group consisting of MEP-1,
LIN(n3628), LIN(n4256), and LIN-65;

(d) purifying said nucleic acid-protein complex using an immunological
method; and

(e) isolating said nucleic acid, wherein said isolated nucleic acid is a
nucleic acid that binds a synMuv class B polypeptide.

31. The method of claim 30, further comprising the following steps: 

(f) detectably labeling the nucleic acid of step (e); 

(g) contacting a microarray comprising C. elegans nucleic acid 
fragments with said detectably labeled nucleic acid; and 

(h) detecting binding of said detectably labeled nucleic acid, wherein
said binding identifies said nucleic acid as a nucleic acid that binds a synMuv
class B polypeptide.
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32. A vector comprising a nucleic acid having at least 95% identity to 
a Class B synMuv gene selected from the group consisting of: mep-1,
lin(n3628), lin(n4256), and lin-65.

33. The vector of claim 32, wherein said synMuv gene is mep-1
(SEQ ID NO:2).

34. The nucleic acid of claim 33, wherein said synMuv gene
comprises a mutation selected from the group consisting of n3680, n3702, and
n3703.

35. The vector of claim 32, wherein said synMuv gene is lin(n3628)
(SEQ ID NO:24).

36. The vector of claim 32, wherein said synMuv gene is lin(n4256)
(SEQ ID NO:26).

37. The vector of claim 36, wherein said synMuv gene is lin-65 (SEQ 
ID NO:28). 

38. An isolated cell comprising the vector of claim 32. 

39. A nematode comprising the nucleic acid of claim 32. 



40. A nematode comprising a mutation in a Class B synMuv gene
selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65.

41. The nematode of claim 40, wherein said mutation is a mep-1
mutation selected from the group consisting of n3680, n3702, and n3703.
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5 



42. A purified nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a Class B synMuv nucleic acid selected 
from the group consisting of: mep-1, lin(n3628), lin(n4256), and lin-65.

38. An antibody against a Class B synMuv polypeptide selected from 
the group consisting of: MEP-1, LIN(n3628), LIN(n4256), and LIN-65. 



38. A method for identifying a compound that treats a condition 
characterized by inappropriate cell death, said method comprising the steps of:

(a) contacting a nematode comprising a mutation in a Class B synMuv
gene selected from the group consisting of: mep-1, lin(n3628), lin(n4256), and
lin-65 with a candidate compound;

(b) detecting a Muv phenotype in said contacted nematode relative to a
control nematode; wherein a candidate compound that alters the phenotype of
said contacted nematode relative to said control nematode is a compound that
treats a condition characterized by inappropriate cell death.





39. The method of claim 38, wherein said cell is in a nematode. 

40. The method of claim 38, wherein said alteration is an alteration in 
synMuv phenotype. 
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41. A method for identifying a compound that treats a neoplasia, said 
method comprising the steps of: 

(a) contacting a cell comprising a mutation in a gene encoding 
KIAA1732 and a second mutation in a synMuv nucleic acid, or an ortholog
thereof, with a candidate compound;

(b) detecting a phenotypic alteration in said contacted cell relative to a 
control cell; wherein a candidate compound that alters the phenotype of said 
contacted cell relative to said control cell is a compound that treats a neoplasia. 

42. The method of claim 1, wherein said synthetic multivulval gene
is a synMuv class A gene.

43. The method of claim 1, wherein said cell is an isolated
mammalian cell.

44. The method of claim 1, wherein said phenotypic alteration is a
decrease in cell proliferation.

45. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising:

(a) providing a cell having a mutation in a nucleic acid encoding
KIAA1732 and having a second mutation in a synMuv nucleic acid, or
ortholog thereof;

(b) contacting said cell with a candidate compound; and

(c) detecting a decrease in proliferation of said cell contacted with said
candidate compound relative to a control cell not contacted with said candidate
compound, wherein a decrease in proliferation identifies said candidate
compound as a candidate compound that treats a neoplasia.
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46. The method of claim 8, wherein said cell has a mutation in Dp, 
E2F, or histone deacetylase.

47. The method of claim 5, wherein said cell is an isolated
mammalian cell.

48. A method of identifying a compound that treats a neoplasia, said 
method comprising: 

(a) providing a cell expressing a nucleic acid having at least 95%
identity to a nucleic acid that encodes KIAA1732;

(b) contacting said cell with a candidate compound; and 

(c) monitoring the expression of said nucleic acid, an alteration in the 
level of expression of said nucleic acid indicates that said candidate compound 
is a compound that treats a neoplasia. 

49. The method of claim 8, wherein said gene comprises a reporter
gene.

50. The method of claim 8, wherein said reporter gene comprises lacZ,
gfp, CAT, or luciferase.

51. The method of claim 8, wherein said expression is monitored by
assaying protein level.

52. The method of claim 8, wherein said expression is monitored by
assaying nucleic acid level.

53. The method of claim 12, wherein said cell is an isolated
mammalian cell. 

WO 2004/024084 



PCT/US2003/028626 



54. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a KIAA1732 polypeptide; 

(b) contacting said cell with a candidate compound; and 

(c) comparing the expression of said polypeptide in said cell contacted
with said candidate compound to a control cell not contacted with said
candidate compound, wherein an increase in the expression of said polypeptide
identifies said candidate compound as a candidate compound that treats a
neoplasia.

55. The method of claim 54, wherein said cell is an isolated 
mammalian cell. 

56. The method of claim 54, wherein said expression is monitored 
with an immunological assay.

57. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a KIAA1732 polypeptide;

(b) contacting said cell with a candidate compound; and

(c) comparing the biological activity of said polypeptide in said cell
contacted with said candidate compound to a control cell not contacted with
said candidate compound, wherein an increase in the biological activity of said
polypeptide identifies said candidate compound as a candidate compound that
treats a neoplasia.

58. The method of claim 57, wherein said biological activity is 
monitored with an enzymatic assay. 
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59. The method of claim 57, wherein said biological activity is 
monitored with an immunological assay. 

60. The method of claim 57, wherein said biological activity is 
methyltransferase activity.

61. A method for identifying a nucleic acid that binds KIAA1732,
said method comprising:

(a) providing nucleic acids derived from a mammalian cell;

(b) crosslinking said nucleic acids and their associated proteins to form a
nucleic acid-protein complex;

(c) contacting said nucleic acid-protein complex with an anti-KIAA1732
antibody;

(d) purifying said nucleic acid-protein complex using an immunological
method; and

(e) isolating said nucleic acid, wherein said isolated nucleic acid is a
nucleic acid that binds KIAA1732.

62. The method of claim 61, further comprising the following steps: 
(f) detectably labeling the nucleic acid of step (e);

(g) contacting a microarray comprising human nucleic acid fragments 
with said detectably labeled nucleic acid; and 

(h) detecting binding of said detectably labeled nucleic acid, wherein 
said binding identifies said nucleic acid as a nucleic acid that binds KIAA1732. 

66. A vector comprising a nucleic acid having at least 95% identity to
SEQ ID NO:30.

67. An isolated cell comprising the vector of claim 26.

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68. A method for identifying a compound that treats a neoplasia, said 
method comprising the steps of: 

(a) contacting a nematode comprising a mutation in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 with a
candidate compound; and

(b) detecting an altered phenotype in said contacted nematode relative
to a control nematode; wherein a candidate compound that alters the phenotype
of said contacted nematode relative to said control nematode is a compound
that treats a neoplasia.

69. The method of claim 68, wherein said alteration is an alteration in 
vulval phenotype. 

70. The method of claim 68, wherein said alteration is an alteration in
sterility.

71. The method of claim 68, wherein said synMuv class C gene is
trr-1.

72. The method of claim 71, wherein said mutations are selected
from the group consisting of n3630, n3637, n3704, n3708, n3709, and n3712.

73. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 
(a) providing a cell having a mutation in a Class C synMuv gene
selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 nucleic acid
and having a second mutation in a synMuv nucleic acid or ortholog thereof;

(b) contacting said cell with a candidate compound; and

(c) detecting a decreased proliferation of said cell contacted with said
candidate compound relative to a control cell not contacted with said candidate
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compound, wherein a decrease in proliferation identifies said candidate 
compound as a candidate compound that treats a neoplasia. 

74. The method of claim 73, wherein said cell is in a nematode. 

75. The method of claim 73, wherein said nematode displays an 
alteration in a synMuv phenotype. 

76. The method of claim 73, wherein said cell comprises a mutation
in a class A or class B synMuv gene.

77. A method for identifying a compound that treats a neoplasia, said 
method comprising the steps of: 

(a) contacting a nematode comprising a mutation in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and a
second mutation in a Class A synthetic multivulval gene with a candidate
compound;

(b) detecting an altered phenotype in said contacted nematode relative
to a control nematode; wherein a candidate compound that alters the phenotype
of said contacted nematode relative to said control nematode is a compound
that treats a neoplasia.

78. The method of claim 77, wherein said alteration is an alteration in 
synMuv phenotype. 

79. The method of claim 77, wherein said alteration is an alteration in 
sterility. 
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80. A method for identifying a compound that treats a neoplasia, said 
method comprising the steps of: 

(a) contacting a nematode comprising a mutation in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and a
second mutation in a Class B synthetic multivulval gene with a candidate
compound;

(b) detecting an altered phenotype in said contacted nematode relative
to a control nematode; wherein a candidate compound that alters the phenotype
of said contacted nematode relative to said control nematode is a compound
that treats a neoplasia.

81. The method of claim 80, wherein said alteration is an alteration in
synMuv phenotype.

82. The method of claim 80, wherein said alteration is an alteration in
sterility.

83. A method for identifying a candidate compound that treats a
neoplasia, said method comprising:

(a) providing a cell having a mutation in a Class C synMuv gene
selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and having a
second mutation in a synMuv gene or ortholog thereof;

(b) contacting said cell with a candidate compound; and

(c) detecting a decreased proliferation of said cell contacted with said
candidate compound relative to a control cell not contacted with said candidate
compound, wherein a decrease in proliferation identifies said candidate
compound as a candidate compound that treats a neoplasia.

84. The method of claim 83, wherein said cell is in a nematode. 



85. The method of claim 83, wherein said nematode displays an 
alteration in a synMuv phenotype. 

86. A method of identifying a compound that treats a neoplasia, said 
method comprising:

(a) providing a cell expressing a nucleic acid having at least 95%
identity to a Class C synMuv nucleic acid selected from the group consisting of
trr-1, hat-1, epc-1, and ssl-1;

(b) contacting said cell with a candidate compound; and

(c) monitoring the expression of said nucleic acid, an alteration in the
level of expression of said nucleic acid indicates that said candidate compound
is a compound that treats a neoplasia.

87. The method of claim 86, wherein said gene comprises a reporter
gene.

88. The method of claim 86, wherein said reporter gene comprises
lacZ, gfp, CAT, or luciferase.

89. The method of claim 86, wherein said expression is monitored by
assaying protein level.

90. The method of claim 86, wherein said expression is monitored by 
assaying nucleic acid level. 

91. The method of claim 86, wherein said nucleic acid is in a
nematode. 
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92. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a Class C synMuv polypeptide selected
from the group consisting of TRR-1, HAT-1, EPC-1, and SSL-1 polypeptide;

(b) contacting said cell with a candidate compound; and

(c) comparing the expression of said polypeptide in said cell contacted
with said candidate compound to a control cell not contacted with said
candidate compound, wherein an increase in the expression of said polypeptide
identifies said candidate compound as a candidate compound that treats a
neoplasia.

93. The method of claim 92, wherein said cell is in a nematode.

94. The method of claim 92, wherein said expression is monitored
with an immunological assay.

95. A method for identifying a candidate compound that treats a 
neoplasia, said method comprising: 

(a) providing a cell expressing a Class C synMuv polypeptide selected
from the group consisting of TRR-1, HAT-1, EPC-1, and SSL-1;

(b) contacting said cell with a candidate compound; and

(c) comparing the biological activity of said polypeptide in said cell
contacted with said candidate compound to a control cell not contacted with
said candidate compound, wherein an increase in the biological activity of said
polypeptide identifies said candidate compound as a candidate compound that
treats a neoplasia.

96. The method of claim 95, wherein said cell is in a nematode. 
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97. The method of claim 95, wherein said biological activity is 
monitored with an enzymatic assay. 

98. The method of claim 95, wherein said biological activity is 
monitored with an immunological assay.

99. A method of identifying a nucleic acid target of a synMuv Class 
C polypeptide, said method comprising: 

(a) mutagenizing a C. elegans comprising a first mutation in a Class C
synMuv gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1
and a second mutation in a Class A or Class B synMuv gene;

(b) allowing said C. elegans to reproduce;

(c) selecting a C. elegans comprising a mutation that suppresses a
synMuv phenotype; wherein said mutation identifies a nucleic acid target of a
synMuv class C polypeptide.

100. The method of claim 99, wherein said second mutation is in a 
class A synMuv gene. 

101. The method of claim 31, wherein said second mutation is in a
Class B synMuv gene.

102. A method for identifying a nucleic acid target of a synMuv
Class C polypeptide, said method comprising:

(a) providing a C. elegans comprising a mutation in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1;

(b) growing said C. elegans on bacteria expressing a dsRNA; and

(c) identifying a dsRNA that suppresses a synMuv phenotype; wherein
said dsRNA identifies a nucleic acid target of a synMuv class C polypeptide.

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103. A method for identifying a a nucleic acid target of a synMuv 
class C polypeptide, said method comprising: 

(a) providing a C. elegans comprising mutations in a Class C synMuv
gene selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 and in
a Class A or Class B synMuv gene;

(b) growing said C. elegans on bacteria expressing a dsRNA; and

(c) identifying a dsRNA that suppresses a synMuv phenotype; wherein
said dsRNA identifies a nucleic acid target of a synMuv class C polypeptide.

104. A method of identifying a nucleic acid whose expression is
modulated by a synMuv class C polypeptide, said method comprising:

(a) providing a microarray comprising fragments of nematode nucleic 

acids; 

(b) contacting said microarray with detectably labeled nucleic acids
derived from a nematode comprising a mutation in a Class C synMuv gene
selected from the group consisting of trr-1, hat-1, epc-1, and ssl-1 gene;

(c) detecting an alteration in the expression of at least one nucleic acid
of a C. elegans comprising a mutation in said synMuv class C gene relative to
the expression of said nucleic acid in a control nematode, wherein an alteration
in said expression identifies said nucleic acid as a nucleic acid modulated by a
synMuv class C polypeptide.

105. The method of claim 104, wherein said C. elegans further
comprises a mutation in a synMuv A or synMuv B gene.

106. The method of claim 104, wherein said C. elegans further
comprises a mutation in a gene that results in a Vulvaless (Vul) phenotype.

107. The method of claim 104, wherein said gene encodes LET-60. 
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108. A method for identifying a nucleic acid target of a synMuv class 
C polypeptide, said method comprising: 

(a) providing nucleic acids derived from a nematode cell; 

(b) crosslinking said nucleic acids and their associated proteins to form a
nucleic acid-protein complex;

(c) contacting said nucleic acid-protein complex with an antibody that
binds a polypeptide selected from the group consisting of TRR-1, HAT-1,
EPC-1, and SSL-1;

(d) purifying said nucleic acid-protein complex using an immunological
method; and

(e) isolating said nucleic acid, wherein said isolated nucleic acid is a
nucleic acid that binds a synMuv class C polypeptide.



109. The method of claim 108, further comprising the following steps:

(f) detectably labeling the nucleic acid of step (e);

(g) contacting said detectably labeled nucleic acid with a microarray
comprising C. elegans nucleic acid fragments; and

(h) detecting binding of said detectably labeled nucleic acid, wherein
said binding identifies said nucleic acid as a nucleic acid target of a synMuv
class C polypeptide.
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FIGURE 2 

mep-1 genomic sequence

TCACACACTCAT6ACATACACACATCATTTCGCCTCACACACCGCGCC6TCG 

CCATCCGCACC6CCCGGGTGGGACGTGTTCAAACTTTTCGGTTTTCGTAAT 

TAATAGTGAGCCCCGGTTTATTCGCTTTGAGMTCAGTATAATGGATATATG 

AGATTGTGTMTTAGGTTGCGTGCTTGMCTTTTAAMTTMCTGTTTTAAAT 

TTATCTGCCmATCGTTACAGTAMTCATTTTGATGMCTTnCGGATGMT 

CATAATGAAGTACGCAGCGCTCTAACAAAATGTGTTTGTAAATTCCAATTGC 

TACMGTTGCCCGGCTTATTT7TTGGTGATTGAAGCATGATTCTGTTGACGC 

TCCCGACGCGGAATACCAGGACGGACCGATGAGAGAGTACTGCCAGTGAA 

GAGACGCATGCGAGCAGGACGAGTGCTGCTCACCCTTCTTCTCAGCGTCG 

GCGGCTGCGACCAGCGGCCGAGGAAGGGGAGGAGAGAGGCCGATTTGGC 

TGCGTACCACGTTTGATACTCAGTCACTTACCACAGCTGGTTCTCTTGTGCG 

TTCAAATCTGGCTTGCCGCGCGCGCGCATTTTATTCCTACCAGTTTGAATCT 

CCCACCTCTCCGACTGTMCTGTCCTMTTTGCTTCCTTCTCATCACTCTCTC 

TTTGCCTATTTCTCACTATCTAGACTCTATTTTTCCAGA^^GTCACCGCCGA 

CGAGACGGTACTCGCCACAACGACCAACACCACTTCCATGTCTGTGGMCC 

AACGGATCCGAGAAGCGCTGGTGAATCGTCCTCAGATTCGGAGCCAGACA 

CMTTGAGGTGAGGAAAAGTmGGGMTTTAAATCTGAATAAAACGTTTTCA 

GCAGCTGAAGGCAGAACAGCGCGAAGTGATGGCCGACGCGGCGAATGGTT 

CCGAAGTCAACGGAAATCAAGAGAACGGAAAAGAGGAAGCGGCATCTGCA 

GACGTGGAAGTGATCGAGATAGATGACACCGAAGAGTCTACGGATCCCTCA 

CCTGATGGATCTGATGAAAACGGTGATGCTGCATCTACATCGGTTCCAATC 

GAAGAGGAAGCGCGTAAAAAGGATGAGGGGGCTTCCGAAGTGACTGTdGC 

ATCATCTGAGATTGAACAAGACGATGATGGCGATGTTATGGAAATCACTGAG 

GAGCCGMCGGAAAGTCGGAGGATACTGCCAACGGAACAGGTGTGTTTTAT 

MTTTTACCAAGTTTMTTTTMCTTTCTATTTTCAGTTACTGAGGAGGTGCTA 

GATGAAGAGGAGCCAGAACCTTCCGTAAACGGAACAACTGAGATCGCTACA 

GAGAAAGAGCCAGAAGATTCTTCAATGCCTGTCGAACAGAATGGGAAGGGT 

GTGAAGCGGCCTGTCGAATGCATCGAACTCGACGACGACGATGATGACGA 

GATTCAGGAAATTTCTACCCCTGCCCCAGCTAAAAAAGCTAAAATTGATGAT 

GTCAAGGCGACAAGCGTTCCAGAAGAGGACAACAATGAGCAGGCGCAGAA 

GAGATTGCTCGACAAGCTGGAAGAGTATGTGAAGGAGCAGAAGGATCAACC 

ATCCAGCAAAAGCCGAAAAGTTCTGGACACTCT.TCTCGGAGCAATCAATGC 

GCAAGTTCAAAAG G AGCCTCTGTCGGTTCGG AAGCTGATCCTGGACAAAGT 

TCTCGTTCTCCCAAACACAATATCATTCCCACCAAGTCAAGTTTGCGACTTAT 

TGATTGAGCACGATCCCGAAATGCCTTTGACGAAGGTTATCAACAGGATGTT 

TGGAGAAGAAAGACCAAAGTTGAGTGATTCCGAGAAACGAGAGAGAGCTCA 

GCTGAAACAACATAATCCTGTTCCAAATATGACAAAACTGCTCGTGGACATT 

GGACAGGATCTCGTTCAAGAAGCTACCTATTGTGATATAGTTCACGCGAAGA 

ATOTQ^GAGGTGJCCAAAAAATCTTGAAACCTATMGCMGTCGCTGCGCA 

'GTT G"A^&^TTTX3^ 

GAAAATGCATCGATGCGACGTCTGTGGATTCCAGACGGAATCAAAGCTGGT 

TATGAGCACTCACAAGGAGAATTTGCACTTCACAGGATCCAAATTCCAGTGC 

ACCATGTGTAAAGAGACGGACACGAGTGAGCAAAGAATGAAGGATCACTAC 

TTGTAAGTTTTTTTTTTTTCATCTTTCAATATTCATTTAATTACAGCGAAACTC 

ATCTTGTTATTGCAAAATGGGAAGAGAAGGAGTCCAAGTATCCATGTGCAAT 
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CTGCGAAGAAGACTTCAATTTCAAA6GTGTCCGT6AGCAGCATTACAA6CA 

GTGCAAGAAGGACTACATTCGCATTCGAAACATCATGATGCCGAAGCAAGA 

CGATCATCTCTATATCAACAGATGGCTCTGGGAGAGGCCCCAATTGGATCC. 

CAGCATTCTTCAACAGCAGCMCAAGCTGCTCTTCAGCAAGCTCAACAAAAG 

AAGCAACAGCAACTTCTGCATCAACAGCAAGCAGCACAAGCTGCAGCCGCT 

GCGCAACTCTTACGGAAGCAACAATTACAACAGCAACAACAACAGCAACAG 

gctcgtcttcgtgagcaacagcaagcggcccaattccggcaagtggctcaa 

ctgctgcmcaacaatcagcgcaggctcaacgtgcacagcagaatcaagga 

aatgtgaatcataacactctgattgcaggtaatagctaaacatattttaaata 

agtattttgtataattatttatatttcagcaatgcaagcgtcgttgcgtagag 

gtggtcaacaaggaaattcgctggcagtttctcaacttctccaaaagcaaat ' 

ggcagctttgaagtcgcaacaaggagctcaacaacttcaggctgcggtgaa 

ctccatgagaagccagmcagtcaaaagacgccaacacacagaagttcgaa 

acttgttactacgccgtctcatgctactgttggctcttcttcagctcccacg 

tttgtatgcgaaatttgtgatgcgtcagtgcaggaaaaggagmgtatctac 

agcatcttcaggtmttttmgamcgtttctatttcmtttcaaaaccgatt 

attaaatatcttaaacatcacattttcagactactcataagcagatggttgga 

aaagtgctgcaggacatgtcgcaaggagctccactggcatgttctcgatgc 

cgtgacagattctggacttatgaagggttggagcggcacttggtgatgtcg 

catggtctcgtcactgctgatctgctcctcaaagcgcaaaagaaggaagac' 

6gaggtcgatgcaagacatgcggcaagaactatgcgttcaacatgcttcaa 

cacttggtagctgatcatcaagtgaagttgtgctcggctgaaatcatgtact 

cgtgcgatgtgtgcgcgttcaaatgctcgagttatcagactctggaagccc 

atctcacttcaaatcacccaaaaggagataagaagacatcaacaccagcaaa 

aaaagatgattgtattactctggatgatgstaggaaaacgaatggcttatc 

ccgttctacgaatgagtgctggaaacattcttcacaatgatct.caattatttc 

tcttattctttacattcmtcattttaaatcaccagttctcccac7ttcattga 

tatacacattctattgcgggttccggaaccgaaatcaatcagtactttacttt 

a rrtccccmtttttctcttcatgatatctggtttattctcgcatcttccccta 

ccttcaaaactccctai mill 1 i i caaaacctaactaccccacaattatcatg 

taaaatcaaatt6caattccccataa6ac agat cagtatacactttcacttca 

tacgtctgttgttctcccccatctcatactttttttaccatttgtccagttaa 

gatttttggaagatatctat 
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ATGG^ACCGCCGACGAGACGGTACTCGCCACMCGACCMCACCACTTCC 

ATGTCTGTGGAACCAACGGATCCGAGAAGCGCTGGTGAATCGTCCTCAGAT 

TrGGAGCCAGACAGAATTGAGCAGCTGAAGGCAGAACAGCGCGAAGTGAT 

rrCCGACGCGGCGAATGGTTCCGAAGTCAACGGAAATCAAGAGAACGGAA 

AAGAGGAAGCGGCATCTGCAGACGTGGAAGTGATCGAGATAGATGACACC 

Saagagtctacggatccctcacctgatggatctgatgaaaacggtgatgct 
rcatctacatcggttccaatcgaagaggaagcgcgtaaaaaggatgagggg 
rrttccgaagtgactgtggcatcatctgagattgaacaagacgatgatggc 
ratgttatggaaatcactgaggagccgaacggaaagtcggaggatactgcc 
aacggaacagttactgaggaggtgctagatgaagaggagccagaaccttcc 
rtaaacggaacaactgagatcgctacagagaaagagccagaagattcttca 
atgcctgtcgaacagaatgggaagggtgtgaagcggcctgtcgaatgcat 
rgaactcgacgacgacgatgatgacgagattcaggaaatttgtacccctgc 
cccagctaaaaaagctaaaattgatgatgtcaaggcgacaagcgttccaga 
' agaggacaacaatgagcaggcgcagaagagattgctcgacaagctggaag 
agtatgtgmggagcagmggatcaaccatccagcaaaagccgaaaagttc 
tggacactcttctcggagcaatcaatgcgcaagttcaaaaggagcctctgt 
rggttcggaagctgatcctggacaaagttctcgttctcccaaacacaatatc 
attcccaccaagtcaagtttgcgacttattgattgagcacgatcccgaaatg 

rCTTTGACGMGGTTATCMCAGGATGTTTGGAGAAGAAAGACCAAAGTTGA- 

rTGATTCCGAGAAACGAGAGAGAGCTCAGCTGAAACAACATAATCCTGTTC 

rAAATATGACAAAACTGCTCGTGGACATTGGACAGGATCTCGTTCAAGAAG 

CT^CCTATTGTGATATAGTTCACGCGAAGAATCTTCCAGAGGTGCCAAAAAA 

TCTTGAAACCTATAAGCAAGTCGCTGCGCAGTTGAAACCAGTTTGGGAGAC . 

ATTGAAACGCAAAAATGAGCCGTACMGTTGAAAATGCAJCGATGCGACGT 

CTGTGGATTCCAGACGGAATCAAAGCTGGTTATGAGCACTCACAAGGAGAA 

TTTGCACTTCACAGGATCCAAATTCCAGTGCACCATGTGTAAAGAGACGGAC 

ACGAGTGAGCAAAGAATGAAGGATCACTACrrCGAAACTCATCTTGTTATTG 

CAAAATCGGAAGAGAAGGAGTCCAAGTATCCATGTGCAATCTGCGAAGAAG 

ACTTCAATTTCAAAGGTGTCCGTGAGCAGCATTACAAGCAGTGCAAGAAGG 

ACTACATTCGCATTCGAAACATCATGATGCCGAAGCAAGACGATCATCTCTA 

TATCAACAGATGGCTCTGGGAGAGGCCCCAATTGGATCCCAGCATTCTTCA 

ACAGCAGCAACAAGCTGCTCTTCAGCAAGCTCAACAAAAGAAGCAACAGCA 

ACTTCTGCATCAACAGCAAGCAGCACAAGCTGCAGCCGCTGCGCAACTCTT 

ACGGAAGCAACAATTACAACAGCAACAACAACAGCAACAGGCTCGTCTTCG 

TGAGCAACAGCAAGCGGCCCAATTCCGGCAAGTGGCTCAACTGCTGCAACA 

ACAATCAGCGCAGGCTCAACGTGCACAGCAGAATCAAGGAAATGTGAATCA 

TAACACTCTGATTGCAGCAATGCAAGCGTCGTTGCGTAGAGGTGGTCAACA 

AGGAMTTCGCTGGCAGTTTCTCAACTTCTCCAAAAGCAAATGGCAGCTTTG 

-f/.G^GGJC^ 



GAAATTTGTGATGCGTCAGTGCAGGAAAAGGAGAAGTATCTACAGCATCTTC 
AGACTACTCATAAGCAGATGGTTGGAAAAGTGCTGCAGGACATGTCGCAAG 
GAGCTCCACTGGCATGTTCTCGATGCCGTGACAGATTCTGGACTTATGAAG 
GGTTGGAGCGGCACTTGGTGATGTCGCATGGTCTCGTCACTGCTGATCTGC 
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FIGURE 3 



tcctcaaagcgcaaaagaaggaagacggaggtcgatgcaagacatgcggc
aagaactatgcgttcaacatgcttcaacacttggtagctgatcatcaagtga
agttgtgctcggctgaaatcatgtactcgtgcgatgtgtgcgcgttcaaat
gctcgagttatcagactctggaagcccatctcacttcaaatcacccaaaagg
agataagaagacatcaacaccagcaaaaaaagatgattgtattactctggat
gattaa
MEP-1 protein

MyTADETVl^mNTTSMSVEPTDPRSAGESSSDSEPDTIEQLKAEQREVMAD 

AANGSEVNGNQENGKEEMSADVEVIEIDDTEESTDPSPDGSDENGDAASTSV 

PIEEEARKKDEGASEVTVASSEIEQDDDGDVMEITEEPNGKSEDTANGTVTEEV 

LDEEEPEPSVNGTTEIATEKEPEDSSMPVEQNGKGVKRPVECIELDDDDDDEIQ 

EISTPAPAKKAKIDDVKATSVPEEDNNEQAQKRLLDKLEEYVKEQKDQPSSKSR 

KVLDTLLGAINAQVQKEPLSVRKLILDKVLVLPNTISFPPSQVCDLLIEHDPEMPL 

TKVINRMFGEERPKLSDSEKRERAQLKQHNPVPNMTKLLVDIGQDLVQEATYC 

DIVHAKNLPEVPKNLETYKQVAAQLKPVWETLKRKNEPYKLKMHRCDVCGFQT

ESKLVMSTHKENLHFTGSKFQCTMCKETDTSEQRMKDHYFETHLVIAKSEEKE 

SKYPCAICEEDFNFKGVREQHYKQCKKDYIRIRNIMMPKQDDHLYINRWLWER 

PQLDPSILQQQQQAALQQAQQKKQQQLLHQQQAAQAAAAAQLLRKQQLQQQ

QQQQQARLREQQQAAQFRQVAQLLQQQSAQAQRAQQNQGNVNHNTLIAAM

QASLRRGGQQGNSLAVSQLLQKQMAALKSQQGAQQLQAAVNSMRSQNSQKT

PTHRTPTFVCEICDASVQEKEKYLQHLQTTHKQMVGKVLQDMSQGAPLACSR

CRDRFWTYEGLERHLVMSHGLVTADLLLKAQKKEDGGRCKTCGKNYAFNMLQ

HLVADHQVKLCSAEIMYSCDVCAFKCSSYQTLEAHLTSNHPKGDKKTSTPAKK

DDCITLDD
wild type (n=31)
FAT domain (FRAP, ATM, TRRAP-like)
ATM/PI-3 kinase-like
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gaggaa<3atgtagacgacgattc6gtttccgtactctcatgacrrttggcg 
aaaatcctcacgmttctttttccgtcatacgttgagttaaaaatctggcgat 
gtmcg>aagmtga6mgagcgtttgatgtttgc cataagta gattttactg 
aaataac3aaaaagctttaattaaatataatgatgai 1 1 1 1 1 1 1 i ccaactcact 
tttcgcattgttctgatgtttttagttctgtggctctgcgaaggaaaagtcg 
mtaaatgcagcgamtttcctgttgtttgtgtattgtacattagacattgaa 
gatgatcat.ctaaagcagattccaaagcgattcgggtgtctctaaacgatta 
tmcatttttamgcttttgcctmtrttaatccttactcgtcgtcatcatcaa 
acttgagactgaaagagagaagtttgttccaaaatgggtcataatcgtcgac 
aggttccaaaccgctgagtttcttcagataaatattctcctgtaagaccgtt 
tccttggttataactgatcccatgtgtctgaaatttgttattacactgttaat 
aatcataaaaataaaagaaaaagtcaagaaagggtcaaatattaatcaggtca 
catcttttttattcaataaaatctcctctctcgttcgtggcaatgcacgtgaa 
atgcgccaacaaccgcgagtgcgccaacacacacacatacgcgtcagcag 
acaattcgctctcgtttgamtttagttgtttctttgtttctgctgaaataat 
gtcagttttccgataatttcagcgttttctgactgatttttcttgttgcattc 
acttcctaatagttcattctactccattcttcattttataatctgtttccttcg 
caatttagtgaattaaacacgtaaatcttgtttcagataaattattcaaatagt 
tgcacaaagctcaatagtttagaagtatcttcagtgctggtcactaatacaa 



aMgatccggctatggcttctccaggctatcggtctgtgcagtccgatcg 

gagtaatcacctaacagagctggaaacgagaattcaaaatcttgccgataat 
tcacaaagagatgatgtcaaattgaaaatgttacaagttagtttcaataattc 
gtgttaagtaatcaatttgttcggttgcaggagatttggagcacaatcgaaa 
atcatttcaca ctaa gttcgcacgagaaagtcgtggagaggctcattctctc 
gttcctacaagttttctgcaacacaagtccacagttcattgctgaaaacaat 
acacmcagcttcgamgttmtgcttgaaatcattcttcgactttcgaacg 
tagaagccatgaaacatcatagcaaagaaattatcaagcagatgatgaggct 
aatcaccgtggaaaatgaggagaatgccaatttggctatcaaaattgtcacc 
gatcaagggagaagtaccggcaaaatgcaatattgcggagaggtttcacag 
ataatggtctccttcaaaacaatggtcattgatctgacggcgagtggtcga 
gctggtgatatgttcaacataaaagagcataaagctccaccgtcaactagct 
ccgacgagcaagtcatcactgaatatttgaagacttgctactatcaacaaac 
ggttcttctcaacggaacggaaggaaaaccgccattaaaatacaatatgatt 
ccatcagctcatcagtcaacgaaggtgctcctggaggttccgtatctcgtg 
attttcttctatcaacatttcaaaacagcgatccaaaccgaagcgcttgattt 
catgaggcttggtcttgattttctaaatgtcagagttccagacgaggataaa 
ctcaaaacaaatcaaataataaccgatgattttgtcagtgcacagtcccgat 
tcctgt cattc gtcmcattatggctaagattccagcggtaagtttcgttttt 

TC A u AGTTTmnCTi3TMTCCTGATTTTTATTTTTCAGm 

TGCAAMTGGACCGCTTCTAGfGTCGGGMCMTGCAGATGCTCGAG 

GCCCGGCTGATCTGATAAGTGTCCGACGAGAAGTTCTGATGGCTTTGAAGT 

ATTTCACATCTGGAGAAATGAAGTCGAAATTCTTTCCAATGCTACCTCGACT 

CATCGCTGAGGAGGTTGTTCTGGGAACAGGATTCACTGCGATTGAGCATTT 

GCGAGTTTTCATGTATCAAATGCTAGCAGATCTGTTGCATCACATGCGAAAT 

TCTATAGACTATGAAATGATCACACAGTAAGTTTGAATAAGACTTTCTGATGA 
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AAAATGTTGAAATTTCAGCGTGATTnCGTATTCTGTCGCACTCTTCACGATC 

CTAACAACTCTTCTCAAGTCCAGATTATGTCTGCTCGGCTGCTCAACTCACT 

GGCCGAATCTCTGTGCAAAATGGATTCACATGATACCGTAAGACTTATTCTA 

TCMTAATCGTATCTCACTTCGAAATAAGTTTCAGACTCGTGATCTGCTCATT 

GAAATCCTGGAGTCGCACGTGGCCAAGCTCAAAACTCTTGCAGTCTATCAC 

ATGCCTATTCTCTTCCAACAATACGGAACCGAAATAGACTACGAATACAAAA 

GtTATGAGAGAGACGCCGAGAAACCTGGAATGAATATCCCAAAGGACACTA 

TACGAGGAGTACCGAAACGAAGAATCCGTCGGCTCTCCATTGATTCAGTTG 

AAGAGCTGGAATTCCTGGCATCAGAACCATCCACGTCGGAAGATGCAGATG 

AGAGTGGTGGAGATCCGAACAAGCTTCCTCCGCCAACAAAAGAGGGAAAGA 

AAACGTCTCCCGAAGCGATTTTAACCGCCATGTCAACGATGACACCTCCTC 

CATTGGCAATTGTTGAAGCTCGAAATCTTGTGAAGTATATAATGCATACGTG 

TAAATTCGTGACAGGACAATTGAGMTCGCCCGGCCATCACAGGATATGTAT 

CATTGTTCGAAGGAGCGAGATTTATTCGAACGTCTTCTACGATATGGTGTAA 

TGTGTATGGATGTATTCGTGCTTCCAACAACTCGAAATCAACCACAAATGCA 

TTCTTCAATGCGGACAAAAGATGAGAAAGATGCTCTGGAGTCGTTGGCAAA 

CGTTTTTACAACAATCGACCATGCGATATTCCGGGAAATCTTCGAAAAGTAT 

ATGGATTTCTTGATTGAAAGAATTTACAATCGGAACTATCCATTGCAATTGAT 

GGTGAACACCTTCTTGGTTCGAAATGAAGTGCCATTCTTCGCATCTACGATG 

CTTTCATTCTTGATGTCTCGAATGAAATTGCTGGAAGTTAGCAATGACAAGA 

CGATGCTATATGTGAAGCTCTTCAAAATTATCTTCTCCGCCATCGGAGCCAA 

TGGCT.CTGGGCTTCATGGAGATAAAATGCTCACTTCATACCTCCCAGAGATT 

CTCAAACAGTCAACTGTCTTGGCATTAACAGCTCGTGAACCTCTCAACTATT 

TCCTTTTGCTTCGTGCATTGTTCCGCAGTATTGGTGGTGGCGCTCAGGATAT 

TTTGTATGGAAAGTTCCTGCAGTTACTGCCAAATCTTCTTCAATTCTTGAATA 

MTTGACGGTGAGTTTCATTTTTTGATATATCGGTAATACACTAAAAATCCAG 

AATCTTCAGTCATGTCAACATCGGATTCAAATGCGTGAGCTCTTCGTCGAGt 

TGTGTTTGACTGTGCCAGTTCGACTCAGTTCCCTTCTGCCATACCTACCGCT 

TCTGATGGATCCACTGGTGTGTGCGATGAATGGGAGTCCGAACATAGTTAC 

ACAAGGATTGAGAACATTGGAATTATGTGTGGATAACTTGCAACCTGAATAT 

CTTCTCGAAAATATGCTTCCTGTCCGTGGAGCTTTGATGCAAGGCCTCTGG 

CGTGTTGTATCGAAAGCTCCAGATACATCATCGATGACAGCAGCGTTCAGG 

ATCCTCGGAAAGTTCGGAGGAGCCAATCGAAAACTTCTGAATCAACCGCAA 

ATTCTTCMGTAGCCACTTTAGGCGACGTAAGTTTATTTAGTTTATTCTCTTC 

CTCGTTTTAAGTTCTAACATTGATCCTATTAACAGACTGTTCAGTCGTACATC 

AATATGGAATTCTCGCGGATGGGACTCGATGGCAATCACAGCATTCACCTG 

CCACTGTCCGAGTTGATGAGAGTCGTTGCCGATCAGATGAGATATCCAGCT 

GATATGATCCTTAATCCAAGTCCTGCAATGATCCCGTCAACTCATATGAAGA 

AATGGTGTATGGAATTGTCGAAAGCCGTCTTGTTAGCCGGACTTGGATCTTC 

AGGAAGCCCAATTACTCCAAGTGCAAATCTTCCGAAGATTATCAAGAAACTT 

CTTGMGATTTTGATCCAAACAATCGTACCACTGAAGTATACAeATGTCCGA 

GGGAAAGTGATCGAGAGCTTTTTGTGAATGCACTTCTCG CAATGG CTTGTAA 

GTTCnMGTTCTTTTCTCTCTAATCAGATCTATATTTTAA^TTTTTCAGACGG 

MTATGGMTAAAGACGGtTTCCGGCATGTCTATAGCAAATTCTTTATCAAA 

GTTCTCCGCCAGTTTGCGTTGATTGGAGTACTCGAATACATTGGTGGAAATG 

GATGGATGCGTCATGCAGAAGAGGAAGGTGTTCTACCATTGTGCCTTGACT 
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CGTCTGTTATGGTTGATGCTCTGATTATTTGTCTCTCTGAAACATCGTCAAG 

CTTCATCATTGCTGGTGTCATGTCTCTTCGTCATATCAATGAGACTCTCTCG 

CTTACACTTCCCGATATTGATCAAATGTCGAAAGTTCCAATGTGCAAATACTT, 

GATGGAGAAGGTGTTCAMTTGTGTCACGGGCCTGCTTGGTATGCAAGATC 

TGGTGC5AATCAATGCAATTGGATACATGATCGAATCGTTTCCACGAAAATTT 

GTTATGGACTTTGTGATAGATGTTGnGATTCGAtCATGGMGTTATTTTGG 

gaactgttgaagaaatatcaagtggatctgctgattctgcatacgattgtct 

caagaaaatgatgcgagtctatttcatcaaagaagaaggccaagaagagga 

gmtctgacactcgcgactatttttgtgtctgcmtctctaagcattacttgc 

acagtaatgamgagtcagagmttt gcg att ggttt aat gg atcattgt at 

ggttcactcaagacttgcaccatcccttgataagttctactatcgattcaag 

gagttctttgagccagaattaatgcgggtgctcacaacagttccaacaatgt 

cattggcagacgcaggaggaagtttggatggagttcaaaactatatgttca 

actgtccggatggttttgatttcgaaaaagatatggacatgtagaagcgata 

tttgtcacatctgctggatattgcacaaaccgatacatttaccttaaaccaaa 

ggaatgccttcaaaaaatgcgagacatgcccatcgcatttccttcctccatt 

cccaatcactacacatattgattcaatgcgagccagtgctctacagtgtctt 

gtgatcgcgtatgatcgaatgaagaagcaatacatcgacaagggaatagag 

CTGGGTGATGAGCATAAGATGATAGAGATCCTCGCACTTCGCAGCTCCAAG 

ATCACAGTTGATCAAGTCTACGAGAGCGATGAATCTTGGAGACGATTGATGA 

CAGTTCTATTGAGAGCAGTCACTGACAGAGAAACTCCTGAAATTGCGGAGA 

AGCTT.CATCCTTCACTTTTGAAGGTCTCACCAATATCCACAATCATCATCGCA 

ACATTTGGTGCTTCTTACATAAGAAATATTAGTGGAGCAGGAGATGACAGTG 

ATTCAGATCGTCATATTTCGTACAACGATATAATGAAGTTCAAGTGTCTCGTG 

GAGCTCAATCCAAAGATTCT6GTCACAAAAATGGCAGT GAATC TCGCAAATC 

AMTGGTTAAATATAAGATGAGTGACAAGATCTCTAGGATTTTGTCAGTTCC 

CAGTAGCTTCACTGAAGAGGAGCTCGATGATTTCGAAGCGGAGAAGATGAA 

AGGAATTCGAGAGTTGGATATGATTGGTCATACGGTTAAAATGCTTG CTGG A 

TGCCCAGTGACCACATTCACGGAGCAAATTATTGTGGATATCAGTCGTTTTG 

CTG CTCATTTT GAGT AT GCTTATTC G CAAG AT GTACTTGT AAATTG G ATTG AT 

GATGTCACAGTAATCCTCAACAAAAGTCCCAAAGATGTATGGAAGTTCTTCT 

TGTCTCGAGAATCAATTCTAGATCCTGCACGCAGATCCTTTATTCGAAGAAT 

CATAGTCTATCAATCAAGTGGTCCACTGCGACAGGAATTCATGGATACTCCG 

GAATATTTTGAGAAACTCATTGATCTTGACGATGA GGAG AATAAGGATGAAG . 

ATGAGAGAAAAATCTGGGATCGTGATATGTTTGCATTTTCGATTGTCGATCG 

TATCTCGAAGAGCTGCCCTGAGTGGCTTATTTCTCCGAATTCCCCAATTCCA 

AGAATTAAGAAGTTGTTCTCCGAAACGGAATTCAATGAGCGATATGTGGTTC 

6AGCATTGACTGAGGTGAAGAAATTTCAAGAAGAGATCATAGTGAAACGGA 

TG ACAGAG CAC AAGTACAAG GTTCCG AAGCT G ATTCT G AATACCTTCCTGA 

n at ATTT G AG GTA ATTTC AAGAT A GTTTGT AAAAATT AATT AC AAAG AAAT AT A 

CCAAAACTGAACO^ ' 

TATTnCTCGAAAMTCGTTCAAAATACCAAAAAATTCGAATTCTCACTTCTAA 

MTTATTTTTGAATTTTTAAATMTTTTTGAACATTTCTCTATGAAATTCATGTT 

ttgggcctatttcaggctataaaaattatttttctgattttaaataacttgcaa 
atttcaggctcaacatctatgactacgatctattcatcgttatcgcctcgtgt 
ttcaatggcaatttcgtcaccgatctctcttttcttcgcgaatatcttgaaac 
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TGAAGTCATCCCGAAAGTGCCGTTACMTGGCGGAGAGAGCTGTTTCTTCG 
AATTATG CAGAAGTTTGATACGGATCCACAAACTGCTGGAACAAGTATGCAG 

catgtgaaggcccttcaatatttgg™ttcccacgttgcattgggcgttgg 

agcgatat6atacggatgaaattgttggcaccgcaccaatagatgattc6g . 

attcttcgatggatgtagatccggcaggcagctcggataaccttgtggctc 

gtttaacatcagtcattgattctcatcgtaattatctgagcgatggaatggt 

cattgttttctatcmctttgcacattgttcgtacaaaacgcctccgaacata 

ttgacaataataactgcaagaaacaaggtggacgcctacggatcctgAtgct 

cttcgcctggccgtgcctgaccatgtacaatcatcaagatccaacaatgcg 

gtacactggattcttcttcttggccaatattatagagcgtttcacaattaatc 

ggaaaatcgtgcttcaagtgttccatcaacttatgactacttatcagcagga 

cactagagatcaaatccggaaagccattgatatattaactccagctttgagg 

acacgaatggaagatggacacttgcaaatattgagtcatgtgaagaaaattc 

ttatcgaagmtgccataatttgcmcatgttcagcatgttttgtaagtttat 

tatctaamtgattttttttmtgttaaamtttmttttaam 

ctcctttaataattcctgaattttccagccaaatggtggttcgcaattatcgt 

gtctactatcatgttcgattggagcttctcacgcctcttctgaacggagttc 

aacgagcacttgtgatgccaaatagtgttctggaaaagtaagtttccagccc 

gttgttcgtaaactcaccccttgtaaatatttagctggcaaactcgacgtca 

tgcggtggagatctgcgagat.ggtgatcaagtgggaattgttcagaacgct 

gaaaacagatcatattatcagtgacgaagaagctctcgaagttgacaagcaa 

ttggataagctgcgaacagcttcatccacagatcgtttcgatttcgaggag 

gctcataacaagagagacatgcctgatgctcaacgcacgattatcaaaGag 

cacgccgatgtgattgtcaatatgcttgtccgattctgtatgacgttccatc 

agaattcgggttcttcgtccacttctcaaagtgggaaccatggtgtcgagtt 

gaecaaaaaatgtcagctgcttctacgtgcagccctacgaccaagcatgtg 

gggagmtttgtcagcttccgatrmcaatgatcgaaaagtttttgtcaatt . 

ccgaatgataatgctctacgcaatgatataagttctacggcctacgctaata 

ctatccaaaatgcacaacacactctggatatgctgtgtaatattattcctgtt 

atgccaaaaactagcttgatgactatgatgagacaactccaacggccactca 

tacaatgtctcaataacggagctcaggtatgtgaagaacgatgaatagggg 

gttataaatcactaatttctcttagaactttaagatgactcgtcttgtcactc 

aaattgtcagtcggttactcgaaaagacaaatgtttcggttaacgggcttga 

tgagctggagcaattgaatcaatacatttcccgattcctacatgaacatttt 

ggatctcttttgaagtaagttttatttttgaatttccatctttcaacccttcgc 

cagttgcagaaacttgagtggaccagtgttgggagttctcggagcattttc 

tcttttgcgaacaatttgtggacacgagccagcatacttggatcatttgatg 

ccttcatttgtaaaagtgatggagagagctgcaaaagagcacttggcgtat 

gttgcgaactcgcaagatggaaatatggtgaagagtaagttctataaaaaga 

TTCAGATTTTCTA„ATCCGCTTAGATJJCIJJ.CCAGATGTTGCTGAATTGTTGT 

GTGCATGCATGGAGCTGGTACGTCCCAGAGTCGAfCATATCAGfAtGGAGA 

TTAAGAGATCAATTGTTGGTGGTATTATCGCGGAGCTGATTATCAAATCGAA 

TCACGATAAGATCATCCAGACGTCAGTGAAGCTTCTCGGAGCAATGATTAG 

CACGCAGGATATGGAATTTACAATTCTCACTGTTCTTCCGCTACTTGTTCGT 

ATCCAATCAATTATTGTGACCAAGTTCAAGAATTGCAAGGATCTGATAGCAG 

ACTATCTTGTTGTGGTTATTACCGTTTTTGAGAACAGCGAATATCGGAACTC
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g6aagctggatctcgtctctgggaaggattc7tctggggactcaa6agtag 
cgatcctcaaacccgggagaaattctcgatagtttgggagaagacttggcc 
acacatggcaacagtagatattgctcatcgaatgaaatatatcatgcaaaat 
caagattggtccaagttcaaacacgegttttggttgaaattcgcactttggg 
gaatgctacgaacgattgccaaacggccaactgatccgmtmtaagagaa 
agaaagtgatactgttgaagtgtgcaactccatggagaacaattgaatatgc 
agcgaaattgaaggatcagccaatggaagtggaaactgaaatgaaacgaga 
agagccagaagcgatggmgttgacgaaaaagactcgcaagatgarrctaa 
ggatgccggagagcccaaggagaaggaaaagctcacattggaattattggt 
tgctggacmcmgaacttttggatgaagcttccaattatgattttgcggat • 
gctctagatacagtatcccagattacatttgcacttaatggtaaattgttcaa 
agtttatgmtatttttcttaaamtcacmttttcagagaatcaagtgacaa 
gcaagatgtgggtagtgttgttcaaatcattctggagttccttatcacaatc ' 
cgaaatcgaagatttcacggcgctagtcgttccgtttatgagcagtggagt 
"gcataataattatcagacgggtgtacaggatagt-gtgcttgctgtttggctt 
gaagctgttggtgacgctgttcatttgccgtccagattgattgaggtacgtt 
ctgaaaatgaatgctggaaaaaattcgatttttctgtttaaaaaaagttaaaa 

TTTCCGATTTTTTG MTAG CAAAAAAAAMGAAMCATTTATtTTGAAAAAAGA 

GTCCTCA CCGG AATTTTTTAATAAATAMTTTAAAAAAAGAAAAAAAACTAAA 

AACTTCM I I I l I GAAAATCAAAAAAAAAATTACAGAAACAGACGAGGTAAAA 

AATTTTAAAAAAGTTCTGTAAAAAAMTGGAGAATCACAGTTTTCGTTGTCTT 

XTCTGAAAAAAATTTGAAAAATTAAAAATTAACGATTTTTTGGTTTTTMTTTA 

AAAAMTATACGAAAAMGACTGAAGMCTTTTTTTGTCAAAAAAACTTGATT 

TTGATGAGGGAAAMGTTCAAAAACTTGGAGAAATCATCGGAAATTTTAGAA 

GATTCMTAAAAATTTCCAAAAAAAAAAATTGMCATTTATGATTTTTGGGTAT 

TTTGAAAMTTGAAAAATTACGCTTAATTTTTAGATTAAAAAAATCAAAAAAAA 

ACCMCACTCCTTTTGAAACTTGACACTTTTGAAACG I I I I I I I I 1 1 I IGCAAT 

AATAAATTTCTCATTT CAGTTTAT CT C ATCAAAAC ACG AATGCTG GCATACCG 

GAATCAGGCTTCTCGAGAATCATATATGGACAATTCCAAAGCAACTCAACAA 

CACGTTACTCCGAGAAATGAAAGTGGCACCAGGTCTCGCTGGAGATATTGA 

GACACTCGAATCTCTTGGAACACTCTACAATGAGATATCAGAGTTTGATCAG 

TTCGCTGGAATCTGGGAACGCCGTGCTGTATTTCCTGATACGATGAGAGCA 

ATGTCAGCTATGCAATTGGGAGATATGGAATTAGCTCAATCTTATCTGGAAA 

AATCAATGAGCAGTACGTATGAAACTCTTGCTCCGACMTCAATCGTAAGTT 

TGGATCMTCGGTTGTACTTCTCACACAAAATAGTATTCCTTTCAGCAAACAA 

CACTTCAMTTCGGAGAAGCATGTTTCTCCGATTATTGACAAAGAATACGAT 

CATTGGATGGAGATGTACATCACAAATTGCTCGGAGCTTCTTCAGTGGCAAA 

ATGTGGCCGACGTATGCAATGGCAAAGACATGCAACATGTTCGTGGCCTGA 

TCAACGCAGCATCTCACATTCCGGACTGGAATGTGGTCGAGGAGTGTAAAA 

GTCAGATAGCT.GGATGTATTCCACCAAGJTTCCATTTAGATTACACTCTTTTC 
MTTTGATGAGTACTGTTATGGTTAGTTTAAOT 

TTGTTTMTTTTTCAGCGAATGAATGAAAACTCAAGCCCGACACATATGAAG 

GAACGATGCAAAATTGCAATTCAAGAGTGCACAGAAGCTCATATTAGTCGTT 

GGAGAGCACTTCCGTCAGTTGTTTCATATGGTCATGTCAAGATTCTrCAGGC 

AATGAACTTGGTTCGAGAAATTGAAGAGTCTACAGATATTCGCATTGCTCTG 

CTCGAGGCCCCATCAAACAAAGTGGATCAGGCGTTGATGGGCGATATGAAG 
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tcgttgatgaaagtattccgaaatagaacaccaaccacttcggatgatatgg 

gattcgtttggacttggtatgattggaggaatcagattcatggaatgatgct 

tcaaagattcgaatattgggataaagtaggactcaacgtcgctgcaactgga 

aaccagtcaattgtt6cgattcattcaatggctcaagcacagttggccgtag 

ccaaacatgccaagaatettggattccataatttmcgaaagatctactcaa 

caaattagctggattga.cagccataccgatgatggatgctcaagataaagtt 

TGCACTTACGGCAAGACACTTCGCGATATGGCAAAC AGTGC GGCTGACGAA - 

agagtgaaaaatgagctattgtgtgaagcgcttgaagttttggaagatgtgc 

gaattgatgatctacagaaggatcaggttgctgcattgctttatcatcgtgc 

taatattcattcagttcttgatcagtmgttttcaatgccgaaaaaaaattaa 

agttttacaaaaataaatttcagagctgaaaatgctgactacaccttctccgc 

agcctctcaacttgtcgacttgcaaaatagtgtgacaaccactggaatcaag 

ctcatgaaaaattggggccaccatctttacaagagattcttctctacgacag 

tttgcaaggaaaccggaaacaacttcggacggcaggctctcgcttgttact 

tcattgcggctcgtgtggataacgatatcaaggcgagaaaaccgattgcca 

agattttgtggctctcgaagcacttgaatgcgtgtggatcacatgaagtgat 

gaatcgggttattaagaagcaacttcattcacttaatctcttcaattggcttt 

actggcttccacaattggttactgatgttcgatataaacc aaatt cgaacttt 

g tt ct g att ctct g c aag g t aagtttt g aaat attt aaat atttt c ag aatttt 

aaatgamttcatttgcagatggctgctgctcatccacttgaagtattttacc 

acattcgggaggcagttagcgttgacgatattgactcggttctcgaagaag 

attacactgatgagcaaatgtcgatggatgtttcggatgaggattgttttgg 

agacgatccaccatttgatagaattctgaaaatatgtctgaaatatcgtccaa 

ctgatattcgagtcttccatcgtgtcctcaaagaacttgacgagatgaatga 

gacatgggttgaacgtcacttgcgtcatgcgat.ctgcctcaaggatcagat 

gttcamgatttctcggaacaaatggacgcgacgttcaatgagatgcaatat 

tcggAggatgtgactatgatgacgttgagatggaggaaacagctggaagaa 

gacttggtgtatttccmgagaattataatcttgatttgctggagattcgtaa 

caagcgaaagatgatcgtgacgaagggatgtatgggagtcgagaaaagtca 

gataatgttcgaaaaagagctgagtcaagtgttcacagagccggccggcat 

gcmgatgaatttgattttgtcacaaatatgactaatatgatggtctcacagt 

tggatattcatgcagtggatgctccacgccctcagggatatattcgtattgt 

tctcgactggattcgagcgattcgtcgtcgtttcgatcgacttccacgaag 

aatccctctggaatcgtcaagcccatatctcgccagattcagccatcgtaca 

ggatgcatcgaaatgccatacgatttgctcaacgttttgcgcgccaagaat 

catactctgatg gcttccaatcaaacg gg g caatacatatccatgctctctc 

gatttgagccaaactttgagattgtgatcaaaggtggtcaagtgataagaaa 

gatctatattcgaggacaaaccggaaagagtgcggcgttttatctgaagaa 

atctgtgcaggatgagccaactaaccgagttccacaaatgttcaaacatctt 

GAT.CACGTTeTACAAAe.CGATAi3AjGAGiJCGG 
CCMCAGTGCTGCAGATGAGAGTCGGACAGM^ 

GCATCCGTTCAACCATATGCAATGCCACCGGATTGTACCAGAAACTATCCAG 
CATCACAAATCGACATTGTTCATCCATATGATGTGCTGACTGGCACTTTGAAT 
GGAAGTTATTATCCGGATGATATGGTATTGCACTTCTTTGAGAGATTCGCCC 
AAAGTTCTTCATCCATCGGACAACCTCTTCCAACTCCGACGAACCAAGATGG 
AACAGTTGCTCCGCCACGACTAACGGAAGCTCACCACATCAAGAATATTATT 
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tatgagtacgtttgagaagctagtgtctaaaataatmttmtgtaaaaaaAt 

tttcagagactttgccegagatatgatcccattccgacttctctaggactac 

ctcactgcacgatatcctgatccggttatgtactatgcaatgaagaagcaat- 

tgctgcacagtctcgccgtcctatccacaatcgaatatcattgcaatctgac 

accaatgggacctgatcaaatgatgatgacaatgaatactggagtccttagc 

aatccttcatatagattggaaatccgaggaggacgatcacttcatgatattc 

aacactttggacatgaagttccattccgattgactccamtctatcgattttg 

gttggtgttgcacaggatggtgacttgttatggagtatggctgctgcgtca 

aaat gtttg at g aag aag g aacct g aagtt at cat g ag accgtt agt at g gg 

atgaattcgccaacmtacagattgcgacamtcggtmttttactttaatat 

gctmtagggmttgmctmtgttttccaagcgtttgcaggtattcgcgtg 

tcatgcatcgaattcttacatcaatggtgtcgcgagcaagcttcgaaacacg 

aatagcgccgacgccaaactcagaaaggacgattgtgtgtcgctgatcagt 

CGAGCCAAGGATTCGGATAATCTGGCCCGAATGCCACCCACCTACCACGC 
GTGGrrcH^ TCTCATM ™ CCG ^ CTCTA " n ^ GATCCCGCCTC CCACTC 

tcacagatctctatacatttgtcaaatgtttccaaatcttttatctgcccata 
c att c g ttttt att g tttt gtttcttttcttt cttt attt ctttt ct aaacttt a 

agatttatgtaaatatttmctgcgctggtatttatgaaaaattcagataaag 

ttttcaagttraaaamtcgaaaattcgaagtcggaagttctcttacaggtgt 

agTaAgtaggcacaatggcaataggtacatggaagg'cttgcggaaggcaca 

tgggtaggcataagatcgaaaaataagctgatatataaatatagataggtat 

tggttaggcacaaattaggcacgtaggtgtgagctggcaaataggtaggca 

tgacgttcggcaaatcggcaaattgccgatttggcgaaaattttcaaatccg 

gcgatttgccggamtgtttagagamttttttatmgacagaaaaacttaca 

actgtgtctttttgamttcttccggttttctttatacagtgcgtgcaacttc 

tatagcg cccc ccccccccccccccccccctattttttggcgtttcacgcc 

AnCTGATTTTTAllt'TTCTGATTlT^ 

GGATGCTTGGAGAGAAATATCAGCCAGCAAAATAAAGAATCTGGTCAACTCA 

ATGTCGMTAGATTTTTTGAGGTTATCGTTAAGAAGGGAGGTCCCACGACGT 

ATTGATCCTTCATCGAGTTAACAAATTATGATGTTTTAATTGATTTCATTCCAC 

TTCTGGACACAGAAGGACGAATAGTGCAATCTGGTACAAGTTTATCACCACC 

TACAACTTCGTCGATTTGTGGAAAATCTTTCAGACATGTCTCCATGAGTGTC 

TCAGAACATCTTGGTCAGGTTTGGAGTCGATCCCACCGCTGGGAGCCGAGA 
ATGGGCCTCTAACAC
trr-1 ORF sequence

ATGGATCCGGCTATGGGTTCTCCAGGCTATCGGTCTGTGCAGTCCGATCGG 
AGTAATCACCTAACAGAGCTGGAAACGAGAATTCAAAATCTTGCCGATAATT
CACAAAGAGATGATGTCAAATTGAAAATGTTACAAGAGATTTGGAGCACAAT
CGAAAATCATTTCACACTAAGTTCGCACGAGAAAGTCGTGGAGAGGCTCATT 
CTCTCGTlCCTACAAGTTTTCTGCAACACAAGTCCACAGTTCATTGCTGAAA 
ACAATACACAACAGCTTCGAAAGTTAATGCTTGAAATCATTCTTCGACTTTCG 
AACGTAGAAGCCATGAAACATCATAGCAAAGAAATTATCAAGGAGATGATGA 
GGCTAATCACCGTGGAAAATGAGGAGAATGCCAATTTGGCTATCAAAATTGT 
CAGCGATCAAGGGAGAAGTACCGGCAAAATGCAATATTGCGGAGAGGTTTC 
ACAGATAATGGTCTCCTTCAAAACAATGGTCATTGATCTGACGGCGAGTGGT
CGAGCTGGTGATATGTTCAACATAAAAGAGCATAAAGCTCCACCGtCAACTA 
GCTCCGACGAGCAAGTCATCACTGAATATTTGAAGACTTGCTACTATCAACA 
AACGGTTCTTCTCAACGGAACGGAAGGAAAACCGCCATTAAAATACAATATG 
ATTCCATCAGCTCATCAGTCAACGAAGGTGCTCCTGGAGGTTCCGTATCTC 
GTGATTTTCTTCTATCAACATTTCAAAACAGCGATCCAAACCGAAGCGCTTG 
ATTTCATGAGGCTTGGTCTTGATTTTCTAAATGTCAGAGTTCCAGACGAGGA 
TAAACTCAAAACAAATCAMTAATAACCGATGATTTTGTCAGTGCACAGTCCC 
GATTCCTGTCATTCGTCAACATTATGGCTAAGATTCCAGCGTTTAtGGATCTT 
ATCATGCAAAATGGACCGCTTCTAGTGTCGGGAACAATGCAGATGCTCGAG 
CGGTGCCCGGCTGATCTGATAAGTGTCCGACGAGAAGTTCTGATGGCTTTG 
AAGTATTTCACATCTGGAGAAATGAAGTCGAAATTCTTTCCAATGCTACCTC 
GACTCATCGCTGAGGAGGTTGTTGTGGGAACAGGATTCACTGCGATTGAGC 
ATTTGCGAGTTTTCATGTATCAAATGCTAGCAGATCTGTTGCATCACATGCG 
AAATTCTATAGACTATGAAATGATCACACACGTGATTTTCGTATTCTGTCGCA
CTCTTCACGATCCTAACAACTCTTCTCAAGTCCAGATTATGTCTGCTCGGCT 
GCTCAACTCACTGGCCGAATCTCTGTGCAAAATGGATTCACATGAtACCTTT 
CAGACTCGTGATCTGCTCATTGAAATCCTGGAGTCGCACGTGGCCAAGCTC 
AAAACTCTTGCAGTCTATCACATGCCTATTCTCTTCCAACAATACGGAACCG 
AAATAGACTACGAATACAAAAGTTATGAGAGAGACGCCGAGAAACCTGGAA 
TGAATATCCCAAAGGACACTATACGAGGAGTACCGAAACGAAGAATCCGTC 
GGCTCTCCATTGATTCAGTTGAAGAGCTGGAATTCCTGGCATCAGAACCATC 
CACGTCGGAAGATGCAGATGAGAGTGGTGGAGATCCGAACAAGCTTCCTCC 

gccmcaaaagagggaaagaaaacgtctcccgaagcgattttaaccgccat 

gtcaacgatgacacctcctccattggcaattgttgaagctcgaaatcttgtg 

aagtatataatgcatacgtgtaaattcgtgacaggacaattgagaatcgccc 

ggccatcacaggatatgtatcattgttcgaaggagcgagatttattcgaacg 

tcttctacgatatggtgtaatgtgtatggatgtattcgtgcttccaacaact 

cgaaatcaaccacaaatgcattcttcaatgcggacaaaagatgagaaagatg 



GGAMTCTTCGAAAAGTATATGGATTTCTTGATTGAAAGAATTTACAATCGGA 
ACTATCCATTGCAATTGATGGTGAACACCTTCTTGGTTCGAAATGAAGTGCC 
ATTCTTCGCATCTACGATGCTTTCATTCTTGATGTCTCGAATGAAATTGCTGG 
MGTTAGCAATGACAAGACGATGCTATATGTGAAGCTCTTCAAAATTATCTTC 
TCCGCCATCGGAGCCAATGGCTCTGGGCTTCATGGAGATAAAATGCTCACT 
TCATACCTCCCAGAGATTCTCAAACAGTCAACTGTCTTGGCATTAACAGCTC 
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GTGAACCTCTCMCTATnCCTTTTGCTTCGTGCATTGTTCCGCAGTATrGGT 

GGTC3GCGCTCAGGATATTTTGTATGGAAAGTTCCTGGAGTTACTGCCAAATC 

J7CTTCAATTCTTGAATAAATTGACGAATCTTCAGTCATGTCAACATCGGATT 

CAAATGCGTGAGCTCTTGGTCGAGTTGTGTTTqACTGTGCCAGTTCGACTCA 

GITTCCCTTCTGCCATACCTACCGCTTCTGATGGATCCACTGGTGTGTGCGAT 

GAATGGGAGTCCGAACATAGTTACACAAGGATTGAGAACATTGGAATTATGT 

GTGGATAACTTGCAACCTGMfATCTTCTCGAAAATATGCTTCCTGTCCGTG 

gagctttgatgcaaggcctctggcgtgttgtatcgaaagctccagatacat 

catcgatgacagcagcgttcaggatcctcggaaagttcggaggaggcaatc 

gaaaacttctgaatcaaccgcaaattcttcaagtagccactttaggcgacac 

tgttcagtcgtacatcaatatggaattctcgcggatgggactcgatggcaat 

cacagcattcacctgccactgtccgagttgatgagagtcgttgccgatcag 

atgagatatccagctgatatgatccttaatccaagtcctgcaatgatcccgt 

caactcatatgaagaaatggtgtatggaattgtcgaaagccgtcttgttagc 

cggacttggatcttcaggaagcccaattactccaagtgcaaatcttccgaa 

gat^atcaagaaacttcttgaagattttgatcca aacaa tcgtaccactgaag 

jatacacatgtccgagggaaagtgatcgagagctttttgtgaatgcactrct 

cgcaatggcttacggaatatggaataaagacggtttccggcatgtctatag 

caaattctttatcaaagttctccgccagtttgcgttgattggagtactcgaa 

tacattggtggaaatggatggatgcgtcatgcagaagaggaaggtgttcta 

ccattgtgccttgactcgtctgttatggttgatgctctgattatttgtctctc 

tgamcatcgtcaagcttcatcattgctggtgtcatgtctcttcgtcatatc 

aatgagactctctcgcttacacttcccgatattgatcaaatgtcgaaagttc 

caatgtgcaaatacttgatggagaaggtgttcaaattgtgtcacgggcctg 

cttggtatgcaagatctggtggaatcaatgcaattggatacatgatcgaatc 

gtttccacgaaaatttgttatggactttgtgatagatgttgttgattcgatca 

tggaagttattttgggaactgttgaagaaatatcaagtggatctgctgattc 

tgcatacgattgtctcaagaaaatgatgcgagtct atttc atcaaagaagaa 

ggccaagaagaggagaatctgacactcgcgactatttttgtgtctgcaatct 

ctmgcattacttccacagtaatgaaagagtcagagaatttgcgattggttt 

aatggatcattgtatggttcactcaagacttgcaccatcccttgataagttc 

tactatcgattcaaggagttctttgagccagaattaatgcgggtgctcacaa 

cagttccaacaatgtcattggcagacgcaggaggaagtttggatggagttc 

aamctatatgttcmctgtccggatggttttgatttcgaaaaagatatgga 

catgtacaagcgatatttgtcacatctgctggatattgcacaaaccgataca 

tttaccttaaaccaaaggaatgccttcaaaaaatgcgagacatgcccatcgc 

atttccttcctccattcccaatcactacacatattgattcaatgcgagccagt 

gctctacagtgtcttgtgatcgcgtatgatcgaatgaagaagcaatacatcg 

acaagggaatagagctgggtgatgagcataagatgatagagatcctcgcac 

ttcgcagctccaagatcacagttgatcaagtctacgagagcgatgaatcttg 

gagacgattgatgacagttctattgagagcagtcactgacagagaaactcc 

tgaaatrgcggagaagcttcatccttcacttttgaaggtctcaccaatatcc 

acmtgatcatcgcaacatttggtgcttcttacataagaaatattagtggag 

caggagatgacagtgattcagatcgtcatatttcgtacaacgatataatgaa 

gttcaagtgtctcgtggagctcaatccaaagattctggtcacaaaaatggca 

gtgaatctcgcaaatcaaatggttaaatataagatgagtgacaagatctcta 
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GGATTrTGTCAGTTCCCAGTAGCTTCACTGAAGAGGAGCTCGATGATTTCdA 
AGC G GAGAAG ATGAAAGG AATTCGAG AGTTGGAT ATG ATTG GTC ATACGGT 
TAAAAT G CTT GCTGGATGCCCAGTG ACCACATTC ACGG AGC AAATTATTGTG 
GATATCAGTCGTTTTGCTGCTCATTTTGAGTATGGTTATTCGCAAGATGTACT 
TGTAAATTGGATTGATGATGTCACAGTAATCCTCAACAAAAGTCCCAAAGAT 
GTATGGAAGTTCTTCTTGTCTCGAGAATCAATTCTAGATCCTGCACGCAGAT 
CCTTTATTCGAAGAATCATAGTCTATCAATCAAGTGGTCCACTGCGACAGGA 
ATT CAT G GAT AGTCCGG AATATTTTGAGAAACTCATTGATCTTG ACGAt GAG 
, GAGMTAAGGATGAAGATGAGAGAAAAATCTGGGATCGTGATATGTTTGCAT 
TTTCGATTGTCGATCGTATCTCGAAGAGCTGCCCTGAGTGGCTTATTTCTCC 
GMTrCCCCAATTCCAAGAATTAAGAAGTTGTT.CTCCGAAACGGAATTCAAT 
GAGCGATATGTGGTTCGAGCATTGACTGAGGTGAAGAAATTTCAAGAAGAG 
ATCATAGTGAAACGGATGACAGAGCACAAGTACAAGGTTCCGAAGCTGATT 
CT G AATACCTT CCT G AG ATATTTGAGGCTC AACATCTATGACTACGATCT ATT 
CATCGTTATCGCCTCGTGTTTCMTGGCMTTTCGTCACCGATCTCTCTTTTC 
TTCGCGMTATCTTGAAACTGAAGTCATCCCGAAAGTGCCGTTACAATGGCG 
GAGAGAGCTGmCTTCGAATTATGCAGAAGTTTGATACGGATCCACAAACT 
GCTGGMCAAGTATGCAGCATGTGAAGGCCCTTCAATATTTGGTTATTCCCA 
CGTTGCATTGGGCGTTCGAGCGATATGATACGGATGAAATTGTTGGCACCG 
CACCAATAGATGATTCGGATtCTTCGATGGATGTAGATCCGGCAGGCAGCT 
. CGGATMCCTTGTGGCTCGTTTAACATCAGTCATTGATTCTCATCGTAATTAT 
CTGAGCGATGGAATGGTCATTGTTTTCTATCAACTTTGCACATTGTTCGTAC 
AAAACGCCTCCGAACATATTCACAATAATAACTGCAAGAAACAAGGTGGACG 

cctacggatcctgatgctcttcgcctggccgtgcctgaccatgtacaatca 

tcaagatccaacaatggggtagactggattcttcttcttggccaatattata • 

gag cgtttcacaattaatcgg aaaat cgt g cttcaagtgttccatcaactta 

tgactacttatcagcaggacactagagatcaaatccggaaag;ccattgatat 

att aactccagctttgaggacacgaatggaagatggacacttgcaaat attg 

agtcatgtgmgaaaattcttatcgaagaatgccataatttgcaacatgttca 

gcatgttttccaaatggtggttcgcaattatcgtgtctactatcatgttcgat 

tggagcttctcacgcctcttctgaacggagttcaacgagcacttgtgatgc 

caaatagtgttctggaaaaatttagctggcaaactcgacgtcatgcggtgg 

agatctgcgagatggtcatcaagtgggaattgttcagaacgctgaaaacag 

atcatattatcagtgacgaagaagctctcgaagttgacaagcaattggataa 

gctgcgmcagcttcatccacagatcgtttcgatttcgaggaggctcataa 

caagagagacatgcctgatgctcaacgcacgattatcaaagagcacgccga 

tgtgattgtcaatatgcttgtccgattctgtatgacgttccatcagaattcg 

ggttcttcgtccacttctcaaagtgggaaccatggtgtcgagttgaccaaa 

aaatgtcagctgcttctacgtgcagccctacgaccaagcatgtggggagaa 

xttgtcagcttccgattaacaatgatcgaaaagtttttgtcaattccgaatga 

taatgctctacgcaatgatataagttctacggcctacgctaatactatccaa 

aatgcacaacacactctggatatgctgtgtaatattattcctgttatgccaaa 

aactagcttgatgactatgatgagacaactccaacggccactgatacaatgt 

ctcaataacggagctcagaactttaagatgactcgtcttgtcactcaaattg 

tcagtcggttactcgaaaagacaaatgtttcggttaacgggcttgatgagct 

ggagcmttgaatcaatacatttcccgattcctacatgaacattttggatctc 
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t7ttgmttgcagaaacttgagtggaccagtgttgggagttctcg6agcatt 
ttctcttttgcgmcmttrgtggacacgagccagcatacttggatcatttg 
atgccttcatttgtaaaagtgatggagagagctgcaaaagagcacttqgcg 
•tatgttgcgaactcgcaagatggaaatatggtgaagaatttctttccagatg 
ttgctgaattgttgtgtgcatgcatggagctggtacgtcccagagtcgatc 
atatcagtatggagattaagagatcaattgttggtggtattatcgcggagct 
gattatcaaatcgaatcacgataagatcatccagacgtcagtgaagcttctc 
ggagcaatgatragcacgcaggatatggaatttacaattctcactgtrcttc 
cgctacttgttcgtatccaatcaattattgtgaccaagttcaagaattgcaa 
ggatctgatagcagactatcttgttgtggttattaccgtttttgagaacagc 
gaatatcggaactcggaagctggatctcgtctctgggaaggattcttctgg 
ggactcaagagtagcgatcctcaaacccgggagaaattctcgatagtttgg 
gagaagacttggccacacatggcaacagtagatattgctcatcgaatgaaat 
atatcatgcaamtcaagattggtccaagttcaaacacgcgttttggttgaa 
attcgcactttggggaatgctacgaacgattgccaaacggccaactgatcc 
gaataataagagaaagaaagtgatactgttgaactgtgcaact.ccatggaga 
acaattgaatatgcagcgaaattgaaggatcagccaatggaagtggaaact 
gaaatgaaacgagaagagccagaaccgatggaagttgacgaaaaagactcg 
caagatgattctaaggatgccggagagcccaaggagaaggaaaagctcaca 
ttggmttattgcttgctggacaacaagmcttttggatgaagcttccaatt 
atgattttgcggatgctctagatacagtatggcagattagatttgcacttaat 
gagaatcaagtgacaagcaagatgtgggtagtgttgttcaaatcattctgga 
gttccttatcacaatccgaaatcgaagatttcacggcgctagtcgttccgtt 
tatgagcagtggagtgcataataattatcagacgggtgtacaggatagtgt 
gcttgctgtttggcttgaagctgttggtgacgctgttcatttgccgtccag 
attgattgagtttatctcatcaaaacacgaatgctggcataccggaatcagg 
ctt ctcgag aat c at at atg g ac aattc c aaagc aactc aac aac acgttac 
tccgagaaatgaaagtggcaccaggtctcgctggagatattgagacactcg 
aatctcttggaacactctacaatgagatatcagagtttgatcagttcgctgc 
aatctgggaacgccgtgctgtatttcctgatacgatgagagcaatgtcagc 
tatgcaattgggagatatggaattagctgaatcttatctggaaaaatcaatg 
agcagtacgtatgaaactcttgctccgacaatcaatccaaacaacacttcaa 
attcggagaagcatgtttctccgattattgacaaagaatacgatcattggat 
ggagatgtacatcacaaattgctcggagcttcttcagtggcaaaatgtggc . 
cgacgtatgcaatggcaaagacatgcaacatgttcgtggcctgatcaacgc 
agcatctcacattccggactggaatgtggtcgaggagtgtaaaagtcagat 
agctggatgtattccaccaagtttccatttagattacactcttttcaatttga 
tgagtactgttatgcgaatgaatgaaaactcaagcccgacacatatgaagga 
acgatgcaaaattgcaattcaagagtgcacagaagctcatattagtcgttgg 
agagcacttccgtcagttgtttcatatggtcatgtcaagattcttcaggcaa ' 
tgaacttggttcgagaaattgaagagtctacagatattcgcattgctctgct 
cgaggccccatcaaa^aaagxggatcaggcgttgatgggcgatatgaagtc 
gngatgaaagtattccgaaatag^ 

ttcgtttcgacttggtatgattggaggaatcagattcatggaatgatgcttc 
aaagattcgaatattgggataaagtaggactcaacgtcgctgcaactggaaa • 
ccagtcaattgttccgattcattcaatggctcaagcacagttggccgtagcc 
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AAACATGCCAAGMTCTTGGATTCCATAATTTAAC6AAAGATCTACTCAACAA 

ATTAGCTGGATTGACAGCCATACCGATGATGGATGCTCAAGATAAAGTTTGC 

ACTTACGGCAAGACACTTCGCGATATGGCAAACAGTGCGGCTGACGAAAGA 

GT GAAAAATGAGCTATTGT GTGAAGCGCTTGAAGTTTTGG AAGATGTGCGAA 

TTGATGATCTACAGAAGGATCAGGTTGCTGCATTGCTTTATCATCGTGGTAA 

TATTCATTCAGTTCTTGATCAAGCTGAAAATGCTGACTACACCTTCTCCGCA 

GCCTCTCAACtTGTCGACTTGCAAAATAGTGTGACAACCACTGGAATCAAGC' 

TCATGAAAAATTGGGGCCACCATCTTTACAAGAGATTCTTCTCTACGACAGT 

TTGCAAGGAAACCGGAAACAACTTCGGACGGCAGGCTCTCGCTTGTTACTT 

CATTGCGGCTCGTGTGGATAACGATATCAAGGCGAGAAAACCGATTGCGAA 

GATTTTGTGGCTdTCGAAGCACTTGAATGCGTGTGGATCACATGAAGTGAT 

GAAT C G G GTTATT AAG AAGC AACTTCATT CACTTAATCTCTTCAATTGGCTTT 

ACTGGCTTCCACAATTGGTTACTGATGTTCGATATAAACCAAATTCGAACTTT 

GTTCTGATTCTCTGCAAGATGGCTGCTGCTCATCCACTTCAAGTATTTTACC 

ACATTCGGGAGGCAGTTAGCGTTGACGATATTGACTCGGTTCTCGAAGAAG

ATTACACTGATGAGCAMTGTCGATGGATGTTTCGGATGAGGATTGTTTTGC 

AGACGATCCACCATTTGATAGAATTCTGAAAATATGTCTGAAATATCGTCCAA 

CTGATATTCGAGTCTTCCATCGTGTCCTCAAAGAACTTGACGAGATGAATGA 

GACATGGGTTGAACGTCACTTGGGTCATGCGATCTGCCTCAAGGATCAGAT 

GTTCAAAGATTTCTCGGAACAAATGGACGCGACGTTCAATGAGATGCAATAT 

TCGGAGGATGTGACTATGATGACGTTGAGATGGAGGAAACAGCTGGAAGAA 

GACTTGGTGTATTTCCAACAGAATTATAATCTTGATTTCCTGGAGATTCGTAA 

CAAGCGAAAGATGATCGTGACGAAGGGATGTATGGGAGTCGAGAAAAGTCA 

GATMTGTTCGAAAAAGAGCTGAGTCAAGTGTTCACAGAGCCGGCCGGCAT 

GCAAGATGAATTTGATTTTGTCACAAATATGACTAATATGATGGTCTCACAGT 

TGGATATTCATGCAGTCGATGCTCCACGCCCTCAGGGATATATTCGTATTGT

TCTCGACTGGATTCGAGCGATTCGTCGTCGTTTCGATCGACTTCCACGAAG 

AATCCCTCTGGAATCGTCAAGCCCATATCTCGCCAGATTCAGCCATCGTACA

GGATGCATCGAAATCCGATACGATTTGGTGMCGTTTTGCGCGCCAAGAAT 

CATACTCTGATGGCTTCCAATCAAACGGGGCAATACATATCCATGCTCTCTC 

GATTTGAGCCAAACTTTGAGATTGTGATCAAAGGTGGTCAAGTGATAAGAAA

GATCTATATTCGAGGACAAACCGGAAAGAGTGCGGCGTTTTATCTGAAGAA 

atctgtgcaggatgagccaactaaccgagttccacaaatgttcaaacatctt 

gatcacgttctacaaaccgatagagagtcggcgagaagacatcttcatgct 

ccaacagtgctgcagatgagagtcggacagaagacgacactctacgaagtt 

gcatccgttcaaccatatgcaatgccaccggattgtaccagaaactatccag 

catcacaaatcgacattgttcatccatatgatgtgctgactgccactttcaat 

ggaagttattatccggatgatatggtattgcacttctttgagagattcgccc 

aaagttcttcatccatcggacaacctcttccaactccgacgaaccaagatgg 

aacagttgctccgccacgactaacggaagctcaccacatcaagaatattatt 

tatgaagactttgcccgagatatgatcccattccgacttctctacgactacc 

tcactgcacgatatcctgatccggttatgtactatgcaatgaagaagcaatt 

ggtggacagt-gtgggggtgct-atcgagaatcgaatatcattggaatctgaga 

ccaatgggacctgatcaaatgatgatgacaatgaatactggagtccttagca 

atccttcatatagattcgaaatccgaggaggacgatcacttcatgatattca 

acactttggacatgaagttccatt'ccgattgactccaaatctatcgattttg 
GTTGGTGTTGCACAGGATGGTGACTTGTiFATGGAGTATGGCTGCTGCGTCA

AAATGTTTGATGAAGAAGGAACCTGMGTTATCATGAGACCGTTAGTATGGiG 

ATGAATTCGCCAACAATACAGATTGCGACAAA7CGCGTTTGCAGGTATTCGC 

GTGTCATGCATCGAATTCTTACATCAATGGTGTCGCGAGCAAGCTTCGAAAC 

ACGMTAGCGCCGACGGCAMCTCAGAAAGGACGATTGTGTGTCGCTGATC 

AGTCGAGCCAAGGATTCGGATAATCTGGCCCGAATGCCACCCACCTACCAC 

GCGTGGTTCTAG
TRR-1 protein sequence

MDPAMASPGYRSVOSDRSNHLTELETRIQNLADNSQRDDVKLKMLQEIWSTIE 
NHFTLSShEKWERLILSFLQVFCNTSPOFIAENNTOQLRKLMLEULRLSNVEAM
KHHSKEIIK0MMRLITVENEENANLAIKIVTDQ6RSTGKMQYCGEVSQIMVSFKT 
MVIDLTASGRAGDMFNIKEHKAPPSTSSDEQVITEYLKTCYYQQTVLLNGTEGK
PPLKYNMIPSAHQSTKVLLEVPYLVIFFYQHFKTAIQTEALDFMRLGLDFLNVRV 
PDEDKLKTNQIITDDFVSAQSRFLSFVNIMAKIPAFMDLIMQNGPLLVSGTMQML 
ERCPADLISVRREVLMALKYFTSGEMKSKFFPMLPRLIAEEWLGTGFTAIEHLR 
VFMYQMLADLLHHMRNSIDYEMITHVIFVFCRTLHDPNNSSQVQIMSARLLNSL
AESLCKMDSHDTFQTRDLLIEILESHVAKLKTLAVYHMPILFQQYGTEIDYEYKSY 
ERDAEKPGMNIPKDTIRGVPKRRIRRLSIDSVEELEFLASEPSTSEDADESGGDP 
NKLPPPTKEGKKTSPEAILTAMSTMTPPPLAIVEARNLVKYIMHTCKFVTGQLRIA 
RPSQDMYHCSKERDLFERLLRYGVMCMDVFVLPTTRNQPQMHSSMRTKDEK 
DALESLANVFTTIDHAIFREIFEKYMDFLIERIYNRNYPLQLMVNTFLVRNEVPFF 
ASTMLSFLMSRMKLLEVSNDKTMLYVKLFKIIFSAIGANGSGLHGDKMLTSYLPE 
ILKQSTVLALTAREPLNYFLLLRALFRSIGGGAODILYGKFLQLLPNLLQFLNKLT 
NLQSCQHRIQMRELFVELCLTVPVRLSSLLPYLPLLMDPLVCAMNGSPNIVTQG 
LRTLELCVDNLQPEYLLENMLPVRGALMOGLWRWSKAPDTSSMTAAFRILGK 
FGGANRKLLNQPQILQVATLGDTVQSYINMEFSRMGLDGNHSIHLPLSELMRW 
ADQMRYPADMILNPSPAMIPSTHMKKWCMELSKAVLLAGLGSSGSPITPSANL
PKIIKKLLEDFDPNNRTTEVYTCPRESDRELFVNALLAMAYGIWNKDGFRHWS 
KFFIKVLRQFALIGVLEYIGGNGWMRHAEEEGVLPLCLDSSVMVDALIICLSETS 
SSFIIAGVMSLRHINETLSLTLPDIDQMSKVPMCKYLMEKVFKLCHGPAWYARS 
GGINAIGYMIESFPRKFVMDFVIDWDSIMEVILGTVEEISSGSADSAYDCLKKM 
MRVYFIKEEGQEEENLTLATIFVSAISKHYFHSNERVREFAIGLMDHCMVHSRLA 
PSLDKFYYRFKEFFEPELMRVLTTVPTMSLADAGGSLDGVQNYMFNCPDGFDF 
EKDMDMYKRYLSHLLDIAQTDTFTLNQRNAFKKCETCPSHFLPPFPITTHIDSMR 
ASALOCLVIAYDRMKKQYIDKGIELGDEHKMIEILALRSSKITVDQWESDESWR 
RLMTVLLRAWDREfPEiAEKtHPSLLKVSPiSTillAt 

DRHISYNDIMKFKCLVELNPKILVTKMAVNLANQMVKYKMSDKISRILSVPSSFT 

EEELDDFEAEKMKGIRELDMIGHTVKMLAGCPVTTFTEQIIVDISRFAAHFEYAY 

SQDVLVNWIDDVTVILNKSPKDVWKFFLSRESILDPARRSFIRRIIVYQSSGPLRQ

EFMDTPEYFEKLIDLDDEENKDEDERKIWDRDMFAFSIVDRISKSCPEWLISPNS 

PIPRIKKLFSETEFNERYWRALTEVKKFOEEIIVKRMTEHKYKVPKLILNTFLRYL 

RLNIYDYDLFIVIASCFNGNFVTDLSFLREYLETEVIPKVPLQWRRELFLRIMQKF 

DTDPQTAGTSMQHVKALQYLVIPTLHWAFERYDTDEIVGTAPIDDSDSSMDVDP 

AGSSDNLVARLTSVIDSHRNYLSDGMVIVFYQLCTLFVQNASEHIHNNNCKKQG 

GRLRILMLFAWPCLTMYNHQDPTMRYTGFFFLANIIERFTINRKIVLQVFHQLMT 

TYOODTRDQIRKAIDILTPALRTRMEDGHLQILSHVKKILIEECHNLQHVQHVFQ 

MWRNYRVYYHVRLELLTPLLNGVQRALVMPNSVLEKFSWQTRRHAVEICEMV 

IKWELFRTLKTDHIISDEEALEVDKOLDKLRTASSTDRFDFEEAHNKRDMPDAQ 

RTIIKEHADVIVNMLVRFGMTF-HQNSGSSSTSQSGNHGVELTKKCQLLLRAALR 

PSMWGEFVSFRLTMIEKFLSIPNDNALRNDISSTAYANTIONAQHTLDMLCNIIPV 

MPKTSLMTMMROLQRPLIOCLNNGAQNFKMTRLVTQIVSRLLEKTNVSVNGLD 

ELEOLNQYISRFLHEHFGSLLNCRNLSGPVLGVLGAFSLLRTICGHEPAYLDHL 

MPSFVKVMERAAKEHLAYVANSQDGNMVKNFFPDVAELLCACMELVRPRVDHI 
SMEIKRSIVGGIIAELIIKSNHDKIIQTSVKLLGAMISTQDMEFTILTVLPLLVRIQSII

VTKFKNCKDLIADYLVWITVFENSEYRNSEAGSRLWEGFFWGLKSSDPQTREK 

FSIWEKTWPHMATVDIAHRMKY1MQNQDWSKFKHAFWLKFALWGMLRT1AKR 

PTDPNNKRKKVILLNCATPWRTIEYAAKLKDQPMEVETEMKREEPEPMEVDEK

DSQDDSKDAGEPKEKEKLTLELLLAGQQELLDEASNYDFADALDTVSQITFALN 

ENQVTSKMWWLFKSPWSSLSQSEIEDFTALWPFMSSGVHNNYQTGVQDSV 

LAVWLEAVGDAVHLPSRLIEFISSKHECWHTGIRLLENHIWTIPKQLNNTLLREM 

KVAPGLAGDIETLESLGTLYNEISEFDQFAAIWERRAVFPDTMRAMSAMiQLGD 

MELAQSYLEKSMSSTYETLAPTINPNNTSNSEKHVSPIIDKEYDHWMEMYITNC 

SELLQWQNVADVCNGKDMQHVRGLINAASHIPDWNWEECKSQIAGCIPPSFH 

LDYTLFNLMSTVMRMNENSSPTHMKERCKIAIQECTEAHISRWRALPSWSYG 

HVKILQAMNLVREIEESTDIRIALLEAPSNKVDQALMGDMKSLMKVFRNRTPTTS 

DDMGFVSTWYDWRNQIHGMMLQRFEYWDKVGLNVAATGNQSIVPIHSMAQA 

QLAVAKHAKNLGFHNLTKDLLNKLAGLTAIPMMDAQDKVCTYGKTLRDMANSA 

ADERVKNELLCEALEVLEDVRIDDLQKDQVAALLYHRANIHSVLDQAENADYTF 

SAASQLVDLQNSVTTTGIKLMKNWGHHLYKRFFSTTVCKETGNNFGRQALACY 

FIAARVDNDIKARKPIAKILWLSKHLNACGSHEVMNRVIKKQLHSLNLFNWLYWL 

POLVTDVRYKPNSNFVLILCKMAAAHPLQVFYHIREAVSVDDIDSVLEEDYTDEQ 

MSMDVSDEDCFADDPPFDRILKICLKYRPTDIRVFHRVLKELDEMNETWVERHL 

RHAICLKDQMFKDFSEQMDATFNEMQYSEDVTMMTLRWRKQLEEDLVYFQQN 

YNLDFLEIRNKRKMIVTKGCMGVEKSQIMFEKELSQVFTEPAGMQDEFDFVTN 

MTNMMVSQLDIHAVDAPRPQGYIRIVLDWIRAIRRRFDRLPRRIPLESSSPYLAR 

FSHRTGCIEMPYDLLNVLRAKNHTLMASNQTGQYISMLSRFEPNFEIVIKGGQVI 

RKIYIRGQTGKSAAFYLKKSVQDEPTNRVPOMFKHLDHVLQTDRESARRHLHA 

PTVLQMRVGQKTTLYEVASVQPYAMPPDCTRNYPASQIDIVHPYDVLTATFNG 

SYYPDDMVLHFFERFAOSSSSIGQPLPTPTNQDGTVAPPRLTEAHHIKNIIYEDF 

ARDMIPFRLLYDYLTARYPDPVMYYAMKKQLLHSLAVLSTIEYHCNLTPMGPDQ 

MMMTMNTGVLSNPSYRFEIRGGRSLHDIQHFGHEVPFRLTPNLSILVGVAQDG 

DLLWSMAAASKCLMKKEPEVIMRPLNWDEFANNTDCDKSRLQVFACHASNSYI 

NGVASKLRNTNSADAKLRKDDCVSLISRAKDSDNLARMPPTYHAWF 
FIGURE 11:
hat-1 genomic sequence

ttgtttrcggattttttgtgtgcttcgtagttgctccgatgatgccggattc 
mcatttgmtgtmcatngmttttgaaattgaaggaattcatrgaatcta 
aagcttgcagggtcmgaccgatacattcttgcaacacatgactcgaaagta 
tgtaggaaamttgmgttggaaacttggaatttgat gaaaaa gtacagtAa 
tccattctctcttatttcgcmctitcttcgatttttgattt^ 
naagctaamttttgctgtmattttcamttcatgcttttcmmcggtt 
ttcaacaaaattatgtttttcagagaaaatctcgtgaacaataactcgg ctac 
igtaccatttaaaggcgcagaccttttcgcgcagcattgatttaaattttttt 
gttcgtggctcaacagtgcaatggacatctagatatctgaaattttaccact
gaattcagttcattttttmgcatcncaaamtttgcgttttcctmtm 
tgtgatcgt i i i i 1 1 i i i gaaagtacaatcgtacattataaataacta 1 1 1 i ic 
aattcgaataatttaattcaagatcatttcgcaaaatmttgccttgaaacgt 

TATGCCGCGGTCMTTTTCMGGACCCTTGTTATTCTTTTTTGAATTGCCGCC 

CTTTTTCCCTGTGGCCGGCGCAGTGCGGCCGAGGTTGGTTTCTAGGCCAG 

CCGGCGCGTTTTATTTTTTTCGAGCATGATTTCACAATTATTTC7TGCAI I I I I 

AAAGTTTTTTATTGATAAMTAGTAAAACTAACMCGGATAATATTATTTTAAA 

ATTAAAAMCTAGTTTGTTCATT7TTGGATCGATTTTTAGATGTTGTTCATGGA 

TTATGCACGCAAGAAAGTACTATCGTTCACATTTGATTGCTATATTATrGAAT 

ATTGMTTTTTCACACAAMTTGTACTATTTCCAGATATTTATC^^ACCGAG 

CCGAAGMGGAGATTATAGAGGACGAAAATCATGGMTATCCAAGAAAATAC 

CAACAGATCCCAGGCAATACGAGAAAGTTACAGAGGGATGCCGGTTATTGG 

TCATGATGGCTTCACAAGMGMGAAAGTTAGTTTTTACATCTATTTAAACAC 

ATTTTCCMTTATTTTCAGGATGGGCCGAAGTTATTTCAAGATGCCGAGCTG 

CAAATGGTTCAATTAAATTCTATGTCCATTATATCGATTGCAACCGAAGACTT 

GACGAATGGGTTCAGTCTGATAGGCTCAATTTAGCGTCGTGTGAGCTACGA 

AAAAAAGGAGGAAAGAAAGGAGCACACTTGCGGGAAGAAAAGTGAGAAATC 

TATAAACTTTTCAAMGATTTTAAATAGTTTTATCAATTCATAATTATTTCAGTC 

GAGATTCGAATGAAAATGAAGGAAAGAAAAGCGGCCGAAAACGAAAGATTC 

CACTACTTCCGATGGATGATCTCAAGGCGGAATCCGTAGATCCATTACAAG 

CAATTTCAACGATGACCAGCGGATCTACTCCAAGTCTTCGAGGTTCCATGTC 

GATGGTCGGCCATAGTGAAGATGCAATGACAAGGATCCGAAATGTCGAATG 

CATTGAACTAGGAAGATCACGAATTCAGCCATGGTACTTTGCACCTTATCCA 

CMCMTTGACAAGTTTGGATTGTATTTATATTTGCGAATTTTGTCTGAAATA 

TCTAAAGTCGAAAACTTGTCTGAAACGGCACATGGTGAGTGTTTCGAGTTAT 

AGAAMTGACCGMTATAMTAACTGTTTTCAAAATTCAAAAATTTTCAATTTT 

CCAAAMTGAAAGMTCGGTGMTTCGAAAAAATTCGAGTTCTTGTGTGTTTT 

TGGCTGMTTTTTCGGTTTTTCTTGCTTTTTCCGTTGATATTAGTTTTGAAACA 

ATGTTTTTAAMTTTTCCGGCATCGAAAAAAATCGCAAATTCTGGGAATTTGC 

TCCAAAAATTGCATTTTTGAAATACTTTTTTGCGAAAACGAAAAAAAAATTCA

CAMCGGTGTTTCAMCCAAATTTATCGTAATCAAAAAAGTTTCGCAAATAGG 

CCATTATTCTGCGTGGGMTTCAAATTAAAATCAGCTACTrrTTCTATTTTGC 

AAMTGGAAAAAAMCGTAAAAAATAGACAAATTTTTAATTTm 

CATTCGGTCCATACTCTTCATTTTCTATCATTTAATTAAAATGCCCAATTCTAA 
TTAATTTTATTTCAGGAAAAATGTGCAATGTGTCACCCACCTGGCAATCAAAT 
CTACAGTCACGATAAACTTTCATTTTTTGAAATCGACGGCGGCAAAAACAAA 
AGCTATGCTCAGMTCTATGCCTGCTTGGCAAACTTTTTCTGGATCACAA6A

CTCTTTACTATGACACGGATCCATTTTT GTTCT ATGTGCTAACCGAAGAAGA 

CGAGMGGGTCATCATATAGTTGGATACTTTTCAAMGAAAAAGAATCAGCT

GAAGMTATMTGTTGCGtGTATTCTTGTGTTACCTCCATTTCAAAAGAAAGG 

ATACGGAAGTTTGCTCATCGAATTCAGCTATGAACTCTCGAAAATTGAAGAG 

AAGACAGGATCACCCGAAAAACCACTATCAGATTrGGGACTTCTCTCATATC 

GATCGTACTGGTCMTGGCCATCATGAAAGAGCTTTTCGCATTCAAAAGACG 

ACATCCAGGCGAAGATATCACAGTTCAGGACATTTCACAAAGTACATCGATT 

AAACGAGAAGATGTTGTGTCAACGTTACAGCAACTTGATCTATACAAATACT 

ATAAGGGATCATACATAATTGTGATTAGTGATGAAAAGCGTCAAGTTTATGA 

GAAACGGATTGAGGCTGCGAAAAAGAAGACACGAATTAA TCCAGCA GCTCT 

GCAATGGCGACCGAAAGAGTACGGAAAGAAAAGAGTGAG 1 1 1 1 1 1 ICAATCA 

AAMTTCGTGTTTACGGCTAAAMCTGAAAATTAAMTTA AATTAA ATTCGTG 

ATAACA1 II I I II I I CAAAAAACCAAAAAAAAACAATTTCG I 1 1 I I GGCAGAAC 

caaaaaaaamtttaaaaaaaaacggtttacgccct atttcat acaaacaaca 
g aaattg c actttttt gag c amtttg accct acmtttttttccagttttttg 
ctctttttcaaaaaaaaacacctaaacactggaaatact aaata ctaaggaaa 
aaaatggamtactggtttacagtgtcaaaaaattgaaattttctaata aaat 
catttttc i 1 i i i act/w\tttatcaaamtttatmctcaaatctttcagtttt 
tgcgmttttttttcgaaaaaacgaaaaaaaatamcctmttItaaccaaatt 
gtaattttgaaaaatctggaacgtccg gaaaac tgaaamt taaa aaaaaaag 
ttttcagaaatttatttttaaaaaaccg i mill iaa atca aattttgtatatgt 
tgatgagaaaaaaaaatagaaatcaatg i i i i i aagttttaaaagaaaaattta 
ttttmttattttagttttmtaaggtatttaaacagtaacaaggatgtcggtt 

TTTCGATTTTCCGAAAAACTAAAAAATTGTCl I 1 1 iCGAl till I AATCGAAAA 
AAAATAGAAATATTTTCACAAAACATACTATT CTTCTAA AAAAAA6AATAGTG 
GGAGATTTTAAATAATTTTTGAACTGTCGGAATTTTTTTCGAAATATGeAAAAA 
TCGAAAMCCGGCACAAMGCAAA^GTCTCCGGGjWATATCTTTAAATTA 
TTTTATGAAC I I I I I I I ICAGGCGCAGATCATGTTC^^CAACAACGACATGT 



GTTCTCGCCACGACGATCTCAACCTGTACATTAAAATATAACACTCCGTTTTA 
TCTCGCATCTACACACCGAAAAGCTTACGCTATCCCTTTATCATTCCCACAC 
CGCTCAGAGAGCGTACGCCTCATTTCATTTCATTTGTTCTGTGTAATAATTTG 
ACTTATTAGTCACTTA I 1 1 I I I I AATGAAATTATTCTTGAATTTCATAATCTTCT 
TGTTGCAGTTCAAATAATTAAAATTCATCATATAGACAAGTAAGTTTATAACT 
GCAAAAGTGAAGTTTTCTAATCATTAAGCGTTCTGAAGATATTCGGCAACCG 
CCTGAGCGATCAGATCACGGCGGGAACGAGTTGAGGCGTAGACATGCTTG 
CAGCCAGTGACAACCTGAAAGATATTCAAAAAATTAATTTCAGGACTCGAAT 
TTTTMCMTCTGMTAAAAAAATCCAAAATTGTATATTATAGAGTTTTTTGAA 
ATCTAAGCGAAAGCGCGCTCCAATGTAAAACGAAAAGTGCTCCGCCCCTAA 
ACGTTGGGTCCCGTTAGGAATTTGTTATTTTTTCGGTTATTTCTGACTATATT 
ATMTTTCGAMCGACMGTATTTTAAACATCATTTCGACATAAAAAATATGT 
AAAACAACAAAAMCAATCGAAAAAATAGTGAAAAAGTTTGAATTTACAGTCT 
CGCCGCCTCCTACCGAGACGTAACGTTAGGAGGGGGAGGGTTTTCCTTTGG 
CATTGAAGCGCGCTTGCTGCGGCCCCATAATTAATAACTTACAGCCTTTGCA 
AAGTCCTTCTTCTGTTCATCCTCAATCTCGTCAATGTATTGATTGGACAACTT 
CTCAATCTCGGACTGTTCCGCATTTTCATCCTTCAATTTTTTGTATTGAGCCT 
TGAATTGAGCCACCTTCTCCTCTCCGAAAGCCTTAACC6AATACTCCTTACA
AGCTTCTTTCAAGTTGCCCTCGGCCTTCTCCTTGGCATCTC 
FIGURE 13



hat-1 ORF

atgaccgagcc6aagaaggagattatagaggac6aaaatcatggaatatcc 
aagaaaataccaacagatcccaggcaatacgagaaagttacagagggatgc
cggttattggtcatgatggcttcacaagaagaagaaagatgggccgaagtt
atttcmgatgccgagctgcaaatggttcaattaaattctatgtccattatat 
cgattgcmccgaagacttgacgaatgggttcagtctgataggctcaattta 
gcgtcgtgtgagctaccaaaaaaaggaggaaagaaaggagcacacttgcg 

GGAAGAAAATCGAGATTCGAATGAAAATGAAGGAAAGAAAAGCGGCCGAAA 

ACGAAAGATTCCACTACTTCCGATGGATGATCTCAAGGCGGAATCCGTAGA

TCCATTACAAGCAATTTCAACGATGACCAGCGGATCTACTCCAAGTCTTCGA 

GGTTCGATGTCGATGGTCGGCCATAGTGAAGATGCAATGACAAGGATCCGA 

AATGTCGAATGCATTGAACTAGGAAGATCACGAATTCAGCCATGGTACTTTG 

CACCTTATCCAGMCMTTGACMGTTTGGATTGTATTTATATTTGCGAATTT 

TGTCTGAAATATCTAAAGTCGAAAACTTGTCTGAAACGGCACATGGAAAAAT 

GTGCAATGTGTCACCCACCTGGCAATCAAATCTACAGTCACGATAAACTTTC 

ATTTTTTGAAATCGACGGCCGCAAAAACAAAAGCTAtGCTCAGAATCTATGC 

CTGCTTGCCAAACTTrTTCTGGATCACAAGACTCTTTACTATGACACGGATC 

CATTTTTGTTCTATGTGCTAACCGAAGAAGACGAGAAGGGTCATCATATAGT 

TGGATACTTTTCAAAAGAAAAAGAATCAGCTGAAGAATATAATGTTGCGTGT 

ATTCTTGTGTTACCTCCATTTCAAAAGAAAGGATACGGAAGTTTGCTCATCG 



ACCACTATCAGATTTGGGACTTCTCTCATATCGATCGTACTGGTCAATGGCC 

ATCATGAAAGAGCTTTTCGCATTCAAAAGACGACATCCAGGCGAAGATATCA 

CAGTTCAGGACATTTCACAAAGTACATCGATTAAACGAGAAGATGTTGTGTC 

AACGTTACAGCAACTTGATCTATACAAATACTATAAGGGATCATACATAATTG 

TGATTAGTGATGAAAAGCGTCAAGTTTATGAGAAACGGATTGAGGCTGCGA 

AAAAGAAGACACGAATTAATCCAGCAGCTCTGCAATGGCGACCCAAAGAGT
ACGGAAAGAAAAGAGCGCAGATCATGTTCTAG 
HAT-1 protein

MTEPKKEIIEDENHGISKKIPTDPRQYEKVTEGCRLLVMMASQEEERWAEVISR

CRAANGSIKFYVHYIDCNRRLDEWVQSDRLNLASCELPKKGGKKGAHLREENR

DSNENEGKKSGRKRKIPLLPMDDLKAESVDPLQAISTMTSGSTPSLRGSMSMV 

GHSEDAMTRIRNVECIELGRSRIQPWYFAPYPQQLTSLDCIYICEFCLKYLKSKT 

CLKRHMEKCAMCHPPGNQIYSHDKLSFFEIDGRKNKSYAQNLCLLAKLFLDHKT

LYYDTDPFLFYVLTEEDEKGHHIVGYFSKEKESAEEYNVACILVLPPFQKKGYGS 

LLIEFSYELSKIEQKTGSPEKPLSDLGLLSYRSYWSMAIMKELFAFKRRHPGEDI 

TVQDISQSTSIKREDVVSTLQQLDLYKYYKGSYIIVISDEKRQVYEKRIEAAKKKT

RINPAALQWRPKEYGKKRAQIMF 
FIGURE 15

A. n4076 (scale bar: 0.5 kb)

B. n4077
epc-1 genomic sequence

rrrcaaaaaaaaaamttacctcgtcaatttcactctcgtcgatgcgatgatt 

atcctcgtccamtacctgaaaagtgtgattttttcacgaataamtotrtt 

cagatacttctagaaaaaaaaaactgmcggmtgttacgaaattmttttca 

mgttgcgamctgmttttcgaeaaamgtttcactgatattcatttcaagc 

atattgc mcgtt tnamttmtttctmgagaaaaaactgcaaaacaattc 

' g aamtmt tttta cmgttactttt cgaaamgtmcaaaaatccactaatg 

mcmgamttmgmcaaamgagcttctcaggctatttttggacgaatat 

tttaataaaactttaaaaAaatcaacgaamtcccctaaaaatcgctgaaaat 

tccaaaaattaaagttcattctcgaccacacctctcgtaaatcagcacgaga 

ctcacgcaacgcgaccgcgccgcactcaacggcattgagtaatgcggagc 

ggcagcgtcgcgtcgtctatttgtgtgtgtgtgcgattgtgtgtggtgcga 

cgtggccgctctgtgtgcctctctagtgagtgttttccgaggagagacaac 

acattttcgagagacgaagagagtggggacgaggaagatagtgtggtaaga 

ggagagtgtgcgcgagggaaagagagcaaagtgtgagtgtctgtgagaag 

agaaggagacccccccccccccgcgctcaaccagtcgatagttggcctga 

gtgtagggccttctgttgtattccactgctaaccccccccaaacacacaaaa 

agactcaaamgtactgcttaaaacacagtgctcagctcatttcatttttgat 

ttttatgctcgccgtcatcggcggatgaattcatcgcaaagtccgtggcga 

ttcaacacgtgcggcgtcctcgccgctcttcttaaccgtagttacaacgtg 

ggagtacagaaagggjgccactacttcgaaggcgtttcgagcccgggcgc 

tcgactggaaccggtctatgactgtatactggggccacgaacttccggacc 

tatcagaatgcagtgttggaaaccgggcggtgacacaaatgccgtctggca 

tggaaamgaagmgmcaggttggtttttggtggattatggattactgctc 

cattttgamtmtcgagttttaatgtctttmcgmttcctggtgcttttt 

tctatccgmtcatgttttmttccgttttccgactactttgaagaattttca 

mtttttgatccctga tgac gtcactatttttgtctttgcctttctggatcgc 

ttttatagttattttcattttttatttc i i l i i i acacttttaaacttaacaattc 

tcttaattcatcctattctatttaattttaagttttgatttttgatttt^ 

ttctcttttctcttttagccgccggtgggcctttattacaactcttaaatcat 

aaaaaaaatcagtttaagcagttatacataactcttattatgaaaaaatcgtta 

TTTTTCGACGGAAACTTCATACTTTGAATTTATTTCCAATTTAGATTTTATTTT 

CT CAAAG TCAGC TCAAT TAACTAACTTAAAATGTTTTGTCCTACCCGCAAAAT 

GTTI KIM I AATATTTTMTTCTATTTTMTTTTTGGCTTTAAAAMTCATTTT 

GCTMGCCTGAGATGAAGGCGAAATCTCGAGAAAAAGCATTTAAAAAGTAAT 

AMTTCCGTTAAAMCGACTTTTTCTATCACAGAAAGTGTTCTCTGAGTGCTA 

ACMC CTTCn CTGTCCAMTTTTGACACAATTTCCCAATTATGCCGACTTAT 

TACACCTTTTTCCGTCAATCTTCTAGTTTTTCCCACCCTCTTGACCCCTGGTG 
ACGTCATTTGTTTGTTCTTCTTCCAAGACATGCCCTGTGGGGTATTTTTTCTC 
AAMTTTTTGCAAATTTATTGGATTCTAAATAAAATTCCAGGAGTCTAGCACC 
AGGMTMTAATGCAMTTTGAAAAAAAAATTAAACAGAAATAATGATTTTAA 
ATGATTATTTAAATTTTAMTTTTAAATTTCCAGGAAAAACACCTGCAAGAAG

eGATTGGTGGGGAGGAAGGGAGTAGATGGGGTATTGAGGT-GAAGGAT-GTGA 
TTCCAACTCCAAAAGTCGACCGAGTCGAAGATCAACGCTATCACTCCACTTA 
TCACAACAAGAATAAAATGCACCGTTCAAAGTATATCAAAGTTCATGGTGAG 
TTTTmMCCAAMmCGGCGAAAATAATTTAAmCCGGmmGAMTT 
AATTTCCGCTTG G GtTTrCTTGTATTTATTATTTTTTGAAATTCCTCTCTG AAT

T C G AMGAAMTMCHG ATTTTTCAG ACTTCCTGGCTAAAACCTTCAAAAAT 

GTTTGTTGATTGGTTCCAMTTnCGCCTGATTCCGMTTTGGAtGTGACAAA 

TTCAAAAAAAMTTCCCTGATTnATATTCMGCTnGTGmGTGTGTTCTTT 

nGGAGCGCGCTTGCATCGTTTGATmCTTCGTCTTTmAAMTTTATTTTC 

GCTTGTTTGATTCATrTnGTCGAGTTTTTTTCTGCCAAAATGAATGAAACTG 

GTTTAAAAMTTGMTTCGGCGAAMTAMTTTTGAAAAACGAAACAAATCAA 

ACGATGCMGCGCGCTCCMTGCGATTTTTTTGGGCGCGGAAATTCGTGAT 

TTCAAGCTTAMTATAAMTCAGGTATATTTTTTCGACTTTTTTCACGTTGAAA 

TTCGGMTCAGAGGAAMTTTTGAGTCAATCAAAAATATTTCCCAGATTTCG 

GTATCTTTMTGCATCAAAAATGAACTTTCACCCCC ATACT CCCA6AAAAATA 

AGAAAACAAATTGCGAAATATTGTTCCCTGATCAAAI I I 1 1 TCTTTTTTTAACT 

ACACTTCTCTGTTTTGMGTGAGAMGTACATTTTTCTGCGTTTCTTATCAGT 

TATCATTTGAAAAGGATCAGMTTTGATGACGATATATTTGTTTAGTTACCTC 

CCTTTTTTCTGAACAGTTTTTGCGAAAAAAGGAGAAAMCCGGAATTTTCTAT 

GAAAAT-GTGATTTATTTTCAGCCTGGCAAGCACTCGAACGAGACGAACCCG' 

AGTATGACTACGACACAGAAGATGAAGCATGGCTATCAGATCACACTCACAT 

TGACCCGCGCGTTTTGGAAAAGATATTCGACACAGTGGAGAGCCATTCATC 

GGAGACACAGATCGCGAGCGAAGATTCGGTGATTAATTTGCATAAATGTAA 

GTTGACGAAATTTCCATTGAAACCCCCCCCCCCCAAAAATATCGTTTAATTG 

GAGCACTGGACTCATCAATCGTGTACGAAATATACGAATATTGGCTGTCGAA 

GCGAACATCGGCTGCGACGACGTCTGGTTGTGTTGGAGTCGGTGGATTAAT 

TCCGAGAGTCAGGACAGAATGTCGGAAGGTAAGAATTTGACTATTTTGAAC 

GAATTTCGTGATGAMCTTCTCTAAAACTTTTAAAGTnTTTATGGCGGTTCA 

AAATTTCGGAAAATTTACACTGATTTTAGCTAAAAACTTGMTTTTGGTCATTT 

GTCCGTGTCACATCTGTCCGAAATCGACTTTTTTTGGAATTATCATCCTTTAT 

TGCACATTTGGCTAGTTTATCTCATTTAATTTCGTTGATTACTAAGGTACATTT 

AAAGCCMTAGGTMCCMCCAAAAACTATCATAATTTTTCTACACTTTTTAA 

TTTTCCGACACTACTTGMTMCCCCATAAGTGACCAATTTTGATAGTTTTTG 

GCTGGTTACCGGCTTTAAATGTACCTTATTAATCAACAAAATTAAATGAGATA 

AACTAGCCAAATGTGCAATAAAGGATGATAATTCCATAAAAAGTCGATTTTG 

GACAGATGTGACACGGGCAAATGACCAAAATTCAAGTTTTTAGCTAAAATCA 

GTGTATTTGTTTCGMGTTTTGAACCGCTATAAAAAAATTTTTGGAATGCTTT 

TGGCMGTTTCATTACGAMTTCACTCATTTTCTATACGCAAAAA7TAGAATT 

TTCAATTAAAAATT CATTTTACAGGAT GGAC AAG GTGTTATC AATCCGTACGT 

TGCATTCCGTCGACGTGCCGAGAAAATGCAGACTCGAAAGAATCGGAAAAA 

CGATGAAGATTCGTATGAGAAGATTCTCAAGTTGGTACATGACATGTCGAAA 

GCTCAACAGCTCTTCGATATGACTGCCCGACGAGAAAAGCAGAAGCTCGCG 

TTGATTGATATGGAATCGGAGATTTTAGCGAAACGAATGGAGATGTCAGATT 

TTGGTGGTTCTCCGAGTTCGTTCAATGAGATCACCGAAAAGATTCGAGCAG 

CAGCAACGTTGGAAGTCGTGAAACCACCACTGGCAGAAATCAACGGATCAG 

ATGAAGTGAAGAAGAGGAAGAAGCCGAGACGAAAGATTGCTGATAAGGATT 
TAATATCGAAAGCCTGGCTTAAAAAGAATGCAGAAAGTTGGAATCGGCCGC

CGTCGCTCTTTGGACAACACAGTGGAAATGTTCCGACGGTTACAACGAAGC 
CAGTTCGAGAGTCGTTGGCGAATGGGCGATTTGCGTTCAAGCGGAGGAGA 
GGATGTGTTTATCGCGCGGCTCTCACCGTTTACAATGTGCCTACAGCGCCT 
GCTACAGTACCTCCAGTACAGACTCAAGCAGCAGT6GCTTCATCAT CATGG
TCAAMTCMCGGATATGGTGCCGTGGMCATGMGTTCTTTGAAACTTTTG 
TTCGGGATTCACAGGATTCAGTTTCTCGATCTCTTGGCTTTGTACGCGGACG 
AATGGGACGAGGTGGGCGAGTTGTATTCGATCGGATGCCTCGCAATCGAG • " 
ACGACAACGACGAACGCACTTCGACAGATCCATGGGCCGAGTATTGT GTCG 
CGGATAGTTCMGGTGAGATTtnGMTMGMTC TTMTT TCACGAGATTTT 
GGTTTTTTTCGCTGCTTTTTCTGTMTTTTGTGGTATTTTTTCTCGT^ 
ATTAAAAMCGGGTTTTAMTMTTTTMCCTGAMTTtCGCTAAAAACCAAG 
AAATTTCATTAAAAMTGCMCAAAAAAAAAGACTGGAGGCACCACCGAATG 
GAGAACAGGAGAACCCAAAACCAC6CCCA TTTTT CCGTGCCGGGCGGCGA 
AAATTmGCAGMmGCTGCMTTTTTCGTmACAMCGAMCAACGAAG 
CTCTGMTGTGTTATTTCGGAGCTTCGTTGTTTCGTTTGTAAAACGAAAAATT 
GCAGCMTTTCTGCAAAAATTTGCGCGCGGCACGGAAA AATGG GCGTAGTT 
TTAGGTTCTCCTGTTCTCCTTTCGGTGGTGCCTCGAGTCI I I 1 1 CGCATTCTT ' 
MTGAMTTTCTTTGTTTTTOGCGAMTTTCAGGTTAAMTTAmAAAACCC 
GTTTTTTTTTCMTTGGAAATGCGAGGAAA AACCA CAA AATCAC AGAGAAAG 
CTTTTGGATTTTTTCGCAGCTTTTTCTGTG ATTTT GTGG 1 1 I I I CCTCGCATTT 
TC AATTG AAAAAAAMCGG GTTTTAAATMTTTTCACCTGAAATTTCGCTAAA 
AACGAGGAMTTTCATTACAAATGCAAAAAAGACTGGAGGCACCACCGAAA 
CCGAATGCAGCTCAGAACAGGATTTACCAAAACAGGATGCAG TAGGCGG AG 
CCAATTCGCAACCACCGCATGCTTATTTCGCATGCCTCGCACGI I I I I I I 1 1 I 
CTCTTGAAACAATGCAACAATATCAAGGAAAAAACGTGCGAGAC TTGCG AAA 
TAAGCATGCGGTGGTTGCGAATTGGCTCCGCCCACTGCATTCT GTTTTGG T 
AMTTCTGTTCTGAGCTGCATtCTGTTTTGTTGGGGCTTCCAGTCTTTTTTGT 
GCATTTTTAATGGAATTTCTTCGTTTTTAGCGAAATTTCAGGTT^ 
AAAACCCGTTTTTTTTTCAATTGGAAATGCGAGGAAAAACCACAAAATCACA 
GAGATAGC6AGGCCCCACGAAAAGGGGAGCAGA ACAAA AAAGGGGGGGG 
GGGGGCTGGCACTGTGCCAAACGCACAAAACGCTTTTTATTCTTATTCAACG 
CACGACTTTGTTATAACCACACTCCGTTATTACGCATCGCGCGCTGT TTAG C 
GTGAAMTACAAAAAAACGTC6TGCGTTGAATGAGAATAAA AAAG CGTTTTG 
TGCGTtTGGCACAGTGCCAGCTCTCCTTTTCGCAGATCCCCTTTTCGTGGG 
GCCTCAGAGAAAGCT6CCATAAACTTTTTTCT TCGCG CTAAGACCAATACCA 
ATAMTCCTTGCGCCTTTMTATGCAAACTATATTTTTCTTCCAGAACGTTCC 
GTGCTCGAAACAGTTCGCTTGGTACCGAAGAAGAAACCGATGATCTAAGCC 
CGAAATCTCTGTATTTCGCTCGCAGTAATCGGTTCGCATTCAACGATGATGA 
AACTGAACGGGAATGGACTTCAAGATGCCAACAATCATCGTGGAGAGATAC 
AGAGGTGGATGATGAGCTGAAAAAGCGGGAAACAACGTC TGAA AGTGAGAT 
TTTGMCGATTTACCTGGGAAAATAGATTATTTTGGGCCTATTTTAATTATTTA 
ATTGCAGAATTTACCGAAACCACGACGAATGGAAGTACCAAAACACACACA 
GAATCGGATGATAGTGAAGTTGAACGGATGGAGGTTGATGATCAAGTTGAT 
GAAGCTCAAATAAGTGTATCATCATCAAAAGACGATGGAATGAATGGAAATG 
ATAAGAACGAGGATGAAGAAGATGATGATGATGATATGGATGTAGATGAACA 
TCAGACTGTCGTGGGTGTGCATCAGCACCAGCAGCAGCAGCATCACCAGC

AAAAAGTTCGGCATCAAATGAATGGTGGTGGTGGTGGTGGTGGAGTGGTAA 
AACTGAAACCGCCGCTGCAAGAACTTTCGCCGCCGCTTTCGGGAAACGGAA ■ 
GAGCGGACAGAGCGGAACCGACGCCGGTTCCGGCAAAGGTAGTGAGGCTT 
TTTTTTTAMTACTCGAAAMGMG6AAAAAATCCCACTTTTAAAAATAC6AT
TCTTAAAAATGCGAATTCC.CTCCAAAATGAGAACTCTGATTGGCCAGGGAGlC 
TCTCATTTTGAATGGAAATTAGTCAAAATTGAAAAATCCCG 1 1 1 1 1 1 1 I TTAAG . 

TTGGATTTTTCATTTTCTCGCGATTTTTTCCGCGTTTCTGTGTCATrCCtGAA 
TTTAACATTTMTAMTTAAAAATGTCTGGAATATTGACAAATTATGCTTCAAA 
TTTTTTGCGCGGGAGnCAAAMTMTTTGGCCCTTTTT^^ 
AAAATATATAAAAMTCATTTTAAAAMTTTAGAMCATTTTTTM 1 'I \ 1 1| I AA 
.CAGTTATATTCGCTATATTGGGACGGTATTCTGTCATTAAACTTGGTGTTGTC 

gaai nrrm i attgctttatmgactcaaaattgtctgaaaacaccgaatttt 

ATAATGAMCTTCTTGGAMCTTCTCAAAAAAAAGTTATGACGGCTCAAAAAA 
TGACCTAAAATTTGTTAAMTTTGAAATTTGACTTGTCGCAACGGCTGGAAAC 
AATT1 1 I 1 1 1 1 1 1 GAMTCACCGTCAAATTTTGAGTATAAMTTTAATTATTTTG 
CGTTTTCAACTCGATTTTTGGTATTTTCAAGTCGATGGACGGCAAGATTTG6 
TTAAAAAATTAAAAGCCGTCCATTTTCTCGCCGTCCATTGACTTTAAACTACC 
TAAATCGAGTTGAAMCGCMGATMTTGACATTTATACCCAAAATTTGACTG ■ 

• TG GTTTTAAAAMGnAGTTTCCAGCCGCTGCGACMGTCAAATrTCCAATTT 
TAACTATTTTAGGCCATTTTTTGAGCCATCATAAC I 1 1 1 I 1 1 1 GAGAAGTTTTT 

AAGAAGTTTCATCATGAMTTCGGTGTTTTCAGACAATTTTGAGTCTAATAAA 
GTAATTTTAAAAMTTCGACAGACACCACCTTTATAGCAATTTTGAATTTTTTT 
TTAAACTTGTCTTGAAAMTCTTGAAAAAAGTCGAATAAATTCCCATTTTCCT 
ATTTTCTT7TTTGCAGATGTGCGGAAGGGTGTGGGACTCAGATGATTGGAGA 
GAGCCGAGTGGATCACCATCAGAATCGAATTGATCAACCGAATGGGGTGGC 
TATACGCCACAAGAACAGCATGCAGTTGTTGTTGCCAACGCGGTAGCTGTC 
GCTTTCAAGGAAAAATTGATGAATGGCGTGGATGATGAT6ATGATCAACAAC 
CATCGCCGGCTA GAGG AGCACGAGATCATTCCATCAAAGAGTTCGTTAGTT 
TTTCTTTGCT1 1 I I I 1 1 1 1 1 TTGATTTTTGAGAGCAAATTTGAAAAGTTTTACA 
CGGTTTTTGAAAAACTGTTGAAATTAAAATTTGTTGAGAATTTGATTTCGAGC
. AAGTJTTA I I 1 1 I AAAAAATTGAATTTTTCAGAAAATTCTGAGTTTTCTTTTTAA 
AAAATTGAAATTTTCAGAAAATTCTGAGTAGCAAGAATCTTTAAGATCCTTAA 
TTrCTATGCAAGAATACGTAGGAGTTTTACTTTGCTCAGGAMTTTTATTTTTT 
GTCAG AGGAG TATATCC6AAAAAGAACAAAAAAAATGCACATTTCTCAAAAC 
GCGTAI 1 1 1 I im CAGTTCGATGTCAACGGTAACACTGCTGGAACGGAAAA 
AGTTCATGATGCCGTCGACAATCGGTCTA^grTTGAACTCTCTGCTGCTGC 
TTCTGCTACTGCTGCTACTGCTGCTCATCGCCAATTTTCAATCCTCCTGAGA 
TTTTTTGATGGTCATTCATTGTTTTGTGCATATCTCTCTCTCTCTCTCTCTCTC 
CCATGATTCTCAAATATTTCAATGTATTTACACCCCCACTCTGTCCGCTGCCT 
MTCCCCGACCGAATAATCAGATTCGCTGGAAAAATCTGCGATTCTTTAATA 
TTGCAACCACCCACCCAATAATATGTGTCTCATCATCTCGGTACTCTCACTT ' 
GAGCCGTGTTTTCTGTAGTATTTTATTCTCTAAAAAAAAATCATTTTTAATATA 

• ATATACGTACACATTTATATCTGTAATATATATTTTTAAAAATGATTCCCCCCT 

CCCCTCCATTCGTTGTTTTTTTTCTGTGGGTTTCAAGCTTTTGAGCTGTGAAA 

AATCTCATCCCAT CATCAT TTTCTATTGTTTTTTTTCACAGTTGAAATATCCTA 
TTTTATCTTTTTeeiTrTTTT^ 

TTTCGTCCCGCGAAACGCCCGCCGCCGCCCAATCCCACTCTCTCTCTCAGT 
CTCTTCTTAATGATCTTCGAAACTATTTTTATTTCCCTCATTAACAATTACGAG 
GTCGTCTTTTTTTTTCCCCACCCCCCACTGTTTGGTGTAATTTTTGTGTTCGG 
GGAGGTTTTTTGTGTGTGGATTTTTGGA 1 1 1 1 HGGAI I I I I I CAACAAAAAA
TTCCCCCGAMTCAAMTTTTTrCCCATTTTCCCCTCMTATTAGTACTGTTG' ' 
TATAAAT AAACT TGCTCTCTCTCTCTCTCTCGAAATCTCCTACTATT AI 1 1 \ I [ 
TAAMGATTTTTCCAACAAAMTTCAAAAAACCACACAAACGACCTCTCTGCA' 
CGCGGTMTCCTCTCTCTTTTTGTCCCCCATTTTCTCTGTTTCTC' 1 1 1 II II CT 
ATCCCCTATACCTGTGATTGGAATATG . 
epc-1 ORF

ATGGCCACTACTTCGAAGGCGTTTC6AQCCCGGGCGCTCGACTCGMCCG 
GTCTATGACTGTATACTGGGGCCACGAACTTCCGGACCTATCAGAATGCAG
TGTTGGAAACCGGGCGGTGACACAAATGCCGTCTGGCATGGAAAAAGAAGA 
AGAAGAGGAAAAACACCTGCAAGAAGCGATTGCTGCCCAGCAAGCCAGTAC 
ATCGGGTATTCAGCTGAACCATGTCATTCCAACTCCAAAAGTCGACCGAGTC 
GAAGATCAACGCTATCACTCCACTTATCACAACAAGAATAAAATGCACCGTT 
CAAAGTATATCAAAGTTCATGCCTGGCAAGCACTCGAACGAGACGAACCCG 
AG'TATGACTACGACACAGAAGATGAAGCATGGCTATCAGATCACACTCACAT 
TGACCCGCGCGTTTtGGAAAAGATATTCGACACAGTGGAGAGCCATTCATC 
GGAGACACAGATCGCGAGCGAAGATTCGGTGATTAATTTGCATAAATCACT 
GGACTCATCAATCGTGTACGAAATATACGAATATTGGCTGTCGAAGCGAACA
TCGGCtGCGACGACGTCTGGTTGTGTTGGAGTCGGTGGATTAATTCCGAGA 
GTCAGGACAGAATGTCGGAAGGATGGACAAGGTGTTATCAATCCGTACGTT 
GCATTCCGTCGACGTGCCGAGAAAATGCAGACTCGAAAGAATCGGAAAAAC 
GATGAAGATTCGTATGAGAAGATTCTCAAGTTGGTACATGACATGTCGAAAG 
CTCAACAGCTCTTCGATATGACTGCeCGACGAGAAAAGCAGAAGCTCGCGT 
TGATTGATATGGAATCGGAGATTTTAGCGAAACGAATGGAGATGTCAGATTT 
TGGTGGTTCTCCGAGTTCGTTCAATGAGATCACCGAAAAGATTCGAGCAGC 
AGCAACGTTGGAAGTCGTGAAACCACCACTGGCAGAAATCAACGGATCAGA
TGMGTGMGMGAGGAAGAAGCCGAGACGAAAGATTGCTGATAAGGATTT 
AATATCGAAAGCCTGGCTTAAAAAGAATGCAGAAAGTTGGAATCGGCCGCC 
GTCGCTCTTTGGACAACACAGTGGAAATGTTCCGACGGTTACAACGAAGCC 
AGTTCGAGAGTCGTTGGGGAATGGGCGATTTGCGTTCAAGCGGAGGAGAG 
GATGTGTTTATCGCGCGGCTCTCACCGTTTACAATGTGCCTACAGCGCCTG 
CTACAGTACCTCCAGTACAGACTCAAGCAGCAGTGGCTTCATCA TCATC GTC 
AAMTGMCGGATATGGTGGCGTCGMCATGAAGTTCTTTGAAACTTTTGTT 
CGGGATTCACAGGATTCAGTTTCTCGATCTCTTGGCTTTGTACGCCGACGAA 
TGGGACGAGGtGGGCGAGTTGTATTCGATCGGATGCCTCGCAATCGAGAC 
GACAACGACGAACGCACTTCGACAGATCCATGGGCCGAGTATTGTGTCGCG 
GATAGTTCAAGAACCTTCCGTGCTCGAAACAGTTCGCTTGGTACCGAAGAA 
GAAAGCGATGATGTAAGCCCGAAATCTCTGTATTTCGCTCGCAGTAATCGGT 
TCGCATTCAACGATGATGAAACTGAACGGGAATGGACTTCAAGATGCCAAC 
AATCATCGTGGAGAGATACAGAGGTGGATGATGAGCTGAAAAAGCGGGAAA 
CAACGTCTGAAAAATTTACCGAAACCACGACGAATGGAAGTACCAAAACACA 
CACAGAATCGGATGATAGTGAAGTTGAACGGATGGAGGTTGATGATCAAGT 
TGATGAAGCTCAAATAACTGTATCATCATCAAAAGACGATGGAATGAATGGA 
AATGATAAGAACGAGGATGAAGAAGATGATGATGATGATATGGATGTAGATG
AACATCAGACTGTCGTGGGTGTGCATCAGCACCAGCAGCAGCAGCATCACC 
AGCAAAAAGTTCGGCATCAAATGAATGGTGGTGGTGGTGGTGGTGGAGTG
GTAAAACTGAAACCGCCGCTGCAAGAACTTTCGCCGCCGCTTTCGGGAAAC 
GGAAGAGCGGACAGAGCGGAACCGACGCCGGTTCCGGCAAAGATGTGCG 
GMCGGTGTCGGACTCAGATGATTGGAGAGAGGCGAGTGGATCACCATCA 
GAATCGAATTCATCAACCGAATGGGGTGGCTATACGCCACAAGAACAGCAT 
GCAGTTGTTGTTGCCAACGCGGTAGCTGTCGCTTTCAAG.GAAAAATTGATG 
AATGGCGTGGATGATGATGATGATCAACAACCATCGCCGGCTAGAGGAGCA 
CGA6ATCATTCCATCAAAGATTC6ATGTCAACGGTAACACTGCTGGAACGG
AAAAAGTTCATGATGCGGTCGACAATCGGTCTATAA 
FIGURE 18

EPC-1 protein

MATTSKAFRARALDSNRSMTVYWGHELPDLSECSVGNRAVTQMPSGMEKEE

EQEKHLQEAIAAQQASTSGIQLNHVIPTPKVDRVEDQRYHSTYHNKNKMHRSK

YIKVHAWQALERDEPEYDYDTEDEAWLSDHTHIDPRVLEKIFDTVESHSSETQI 

ASEDSVINLHKSLDSSIVYEIYEYWLSKRTSAATTSGCVGVGGLIPRVRTECRKD 

GQGVINPYVAFRRRAEKMQTRKNRKNDEDSYEKILKLVHDMSKAQQLFDMTAR 

REKQKLALIDMESEILAKRMEMSDFGGSPSSFNEITEKIRAAATLEVVKPPLAEIN

GSDEVKKRKKPRRKIADKDLISKAWLKKNAESWNRPPSLFGQHSGNVPTVTTK

PVRESLANGRFAFKRRRGCVYRAALTVYNVPTAPATVPPVQTQAAVASSSSSK 

STDMVPSNMKFFETFVRDSQDSVSRSLGFVRRRMGRGGRVVFDRMPRNRDD

NDERTSTDPWAEYCVADSSRTFRARNSSLGTEEETDDLSPKSLYFARSNRFAF 

NDDETEREWTSRCQQSSWRDTEVDDELKKRETTSEKFTETTTNGSTKTHTES

DDSEVERMEVDDQVDEAQITVSSSKDDGMNGNDKNEDEEDDDDDMDVDEHQ 

TVVGVHQHQQQQHHQQKVRHQMNGGGGGGGVVKLKPPLQELSPPLSGNGR

ADRAEPTPVPAKMCGTVSDSDDWREPSGSPSESNSSTEWGGYTPQEQHAVV

VANAVAVAFKEKLMNGVDDDDDQQPSPARGARDHSIKDSMSTVTLLERKKFM 

MPSTIGL 
ssl-1 genomic sequence

cagctgatgt tgttgatgga aaaatgacgg ctgcaaagaa gccattggct gcaactgagc 60
caaaagtgca taataaataa atgtgtttct aggatcttct aataattttt tttctgtttt 120 
ctagctctaa acttgtattt atttcattct tgttctacca aattcccacg gattctacgc 180
tttatgtttc taaattatta ttctttttta tttatatctg cattttcttc taaaaactct 240 
ggtcattttc ttgttttttt cttggtaatt ataaaaatta gtcatacaaa tcttgttaaa 300 
tatctggcta ttcagtgaac aaaccatttt ccgctctaaa ttcgacccga atcaatcgaa 360 
aaatggctca aaacgatgcc atctggctgc aacccccctg tcgtctctca attttgtgta 420
ctctctcgca gccacgcacg cgacgcaacg cactcgcgtc gcggtcgcag ttctttttca 480 
aatttatcgc gccatttttg ttttgcctca tatttatcgg ctcacgattg attttcgtcg 540 
aaaaacgcgc ttaatcgatt cctttttacc tgaaaaatgt tgttccaatt ggaaaaccag 600 
ttgaagatcg atgaattttc aagaaaatca ttcaaatagg caaaacccgc tgaactttga 660 
aattcgattt ttgagttttt tgaagaaaat ataattattt catcatttat gttggtcctg 720 
ttggtcctca gcatagaaaa ttcggacatg acattagaaa ttcataataa ctgctcccaa 780 
tatcgggatt agaacgattt tcagctcaaa atatggaaaa ttggttacat aaaccgcata 840 
tttgtagcat taatcttgaa cagctatatg gcattaaaaa aaaatatata tatacattgt 900 
tttttctctc gaagtttctc tttttgtttc taaaatccgg aatataattt aaaaaaccac 960 
ataaatttca atttgcagta cgagttcccc ccgaatcaca atg ccg gca aca ccg 1015 

Met Pro Ala Thr Pro
1 5

gtg cgt gct tca agt act cga ata agc aga cgt aca tca tca aga tca 1063
Val Arg Ala Ser Ser Thr Arg Ile Ser Arg Arg Thr Ser Ser Arg Ser
10 15 20

gtg gct gat gat cag cca tca act tcg tct gcg gtg gct cca cct cct 1111
Val Ala Asp Asp Gln Pro Ser Thr Ser Ser Ala Val Ala Pro Pro Pro
25 30 35

tca ccc att gcc ata gaa act gat gaa gat gcg gta gtt gag gag gag 1159
Ser Pro Ile Ala Ile Glu Thr Asp Glu Asp Ala Val Val Glu Glu Glu
40 45 50

aaa aag aag aaa aag aca tca gat gat ttg gaa att atc act cca aga 1207
Lys Lys Lys Lys Lys Thr Ser Asp Asp Leu Glu Ile Ile Thr Pro Arg
55 60 65

act cca gtc gat cgg cga att ccc tac att tgc tcg att ctt ttg act 1255
Thr Pro Val Asp Arg Arg Ile Pro Tyr Ile Cys Ser Ile Leu Leu Thr
70 75 80 85

gaa aat cga tcg att cgc gat aaa tt gtacgatttt ttaaatttaa 1301
Glu Asn Arg Ser Ile Arg Asp Lys Leu
90
90 

ttactttcct caaatccgaa taattattag atcgcgcttc gcgtttctgc atccgcggta 1361
ttttgccttc ccactgaaaa tagcagattt atcgaatttt tagcttaaaa aaaaaatgtt 1421
ttttctgcat ttttcaaaca aaccttttgt aaaacagtga aaatcgaatt tcaaatgact 1481
aaaatgaatt ttttttttgt ccactggttg tggaatggtt tgaatttgaa gaaatcagcg 1541
ggatttttcg tattttctga atatttttct attaaaaatc ggtttcaaac cattttttga 1601
cttttgaata gaaaaatatt gagaaaatac gaaaaatcca gctaacttcc agcttgttca 1661
aattcaaacc attccacaac cagtggacga aaaaagttca ttttagtcat ttgaaattcg 1721
atttggtttg tttgaaaaat gcaaaaaaaa aatatttttt aaagctaaaa atttgataaa 1781
tctgaaaaaa atctgctatt ttcagtggaa aggcaaaata ccgcgaagcg cagcaagcgc 1841
gctctaataa ttattccgct tcgagaagag cgtgtattat ttcattgtta catttcaaaa 1901
ttatgaatta atgtttttca g g gtt ctg age age ggt cca gtt cgt caa gaa 1953 

Val Leu Ser Ser Gly Pro Val Arg Gin Glu 
95 100 

gat cac gaa gaa cag att get cga get caa egg ata cag cca gtt gtc 2001 
Asp His Glu Glu Gin He Ala Arg Ala Gin Arg He Gin Pro Val Val 
105 HO 115 120 



gat caa att caa cga gtc gag caa at gtatgtgaag ctgaaaaatt 2047
Asp Gln Ile Gln Arg Val Glu Gln Ile
125

gcaccacaaa tcaattattc taatcttgtt ttacag c ata ctc aat ggt tca gtg 2102
Ile Leu Asn Gly Ser Val
130 135

gaa gat att ctg aaa gat cct cga ttc gca gta atg gca gat ctc aca 2150
Glu Asp Ile Leu Lys Asp Pro Arg Phe Ala Val Met Ala Asp Leu Thr
140 145 150

aaa gaa cca cca cca aca cct gca cct cct cct cca atc cag aag aca 2198
Lys Glu Pro Pro Pro Thr Pro Ala Pro Pro Pro Pro Ile Gln Lys Thr
155 160 165

atg caa ccg att gag gtg aaa att gag gat tca gag ggc tca aat acg 2246
Met Gln Pro Ile Glu Val Lys Ile Glu Asp Ser Glu Gly Ser Asn Thr
170 175 180

gct caa ccg agt gtt ctg ccc agt tgt gga gga gga gag acg aat gtg 2294
Ala Gln Pro Ser Val Leu Pro Ser Cys Gly Gly Gly Glu Thr Asn Val
185 190 195

gaa aga gcc gcc aaa aga gtgagttttg aagatagatt ggtgtgtaaa 2342
Glu Arg Ala Ala Lys Arg
200 205

aaatgaatgt ttatatattc actgcaactt tttcctcacg agggacgagg aaaagtggtt 2402
tctaggccat ggccgaggtg ccgacaagtt tcagcggcca tttatcttgc tttgttttcc 2462
gcctgttttc tttcgttttt catcgatttt tttcgttttt tcttaataaa actgataaat 2522
aaatattttt tgcagatgct aaaacaattt ccaagtaaaa aaattatgta ttcagtgggc 2582
aagcagcggt gaaagtggtc aatgcaatat gatggattac gggaatacaa aacctaaact 2642
ttttctgaaa catgatacat acgctgctta aatgctgaga ctacctgatt ttcataacga 2702
gaccgctgaa aaagttttga ggttttcaaa attcaaattt tttggtgaaa aagtcgagat 2762
tttcgcacaa aaagttgaat tctgaaaacc tcaaattttt ttcagcggtc tcgttatgaa 2822
aatcaggtaa tttcagcatc atatgtatca tgtttcaaaa aaagtttagg ttttgtattc 2882
ccgtaatcca tcatattgca ttgaccactt tcaccgctgc ttgcccactg aatacatgat 2942
tttttacttg gaaattgttt tagcatctgc aaaaaatatt tatttatcag ttttattaag 3002
aaaaaacgaa aaaaatcggt gaaaaacgaa agaaaacagg cggaaaacaa agcaagataa 3062
atggccgctg aaacttgtcg gcccctcggc catggcctag aaaccacttt tcctcgtccc 3122
tcgtgaggaa aaagttgcag tgttattgta aatctcacaa gagtctggca tgatttctca 3182
aaggcgcatg gatttattca gccctaaaat taaataaatc catacgactt taaaggtgga 3242
gttcggaaaa tgaggatttt actttaaaat gctcaaacta gtcccaaatg ccgaattacc 3302
acaaaagaaa aacggaaaaa aattcatcaa gtttgaaaaa aatgcggatg attttgttga 3362
aatttcaacg ctcgctaata ttcctaattt gaaccgcgct tttgtccgcg ccgcactctg 3422
tagaattgca tccgcgctgt ttccttcctc ttccggcgcc ctacttcttt tcgattggaa 3482
atgatgaaaa aatgagacaa aactagaatt cacgtagcgc gtcggaaatg atgaaaatat 3542
catggatgca gcagatctac ggagtgcggc gcggacaaac ggcgcggtaa ttcaaatgag 3602
gaatattagc gagagttgaa atttcaacaa aatcagccgc atttttttca aacttaatgt 3662 
attttttttc gtttttcttt tgtagtaatt cggcatttgg ggctagtgta agcattttaa 3722 
agtaaaatcc tcattttccg aactccacct ttaaaggtgg agtaccgaaa tttgagactt 3782 
tgctttttta ggcccaaatt ggtccaaaac taccgaattt tgtaatgaga cgttctgaaa 3842 
atttatccaa aaaatgttat ggcggttcaa agttcggcaa aatagggccc attttcagct 3902
aaaatcaaat ttttttttcc aactttttcg gtgtcgcaac gtctggagcc taatttttat 3962
ttattaatca ctttttaata aatattgtag cctttgatta ggcgtttatt cgctgattta 4022
agtacattta tggtttttgg ggcacaaata aaagtttcat tttatgcccc aaaaaccata 4082
aatgtactta aatcagcgaa taaacgccta atcaaaggct acaatattta ttaaagagtg 4142
atgaataaat aaaaattagg ttccagacgt tgcgacaccg aaaaagttgg aaaaaatttt 4202
gattttagct gaaaatgtgc cttattttgc cgcgaacttt gaaccgccat aacttttttt 4262
gagaaagaaa ttttcagaac gtctcattac gaaattcggt agttttaaac caatttgggt 4322
ctaaaaagtt tcaaattcca ataaaacata ccaaagtctt gtgaaattac aataaactat 4382
tcctaaacgt attataatcc attctcaatt cttgcag gaa gcg cat gta ttg gct 4437
Glu Ala His Val Leu Ala
210

cga atc gcc gag ctc cgt aag aac ggc tta tgg tcg aac agt cgt ctg 4485
Arg Ile Ala Glu Leu Arg Lys Asn Gly Leu Trp Ser Asn Ser Arg Leu
215 220 225

cca aag tgc gtc gaa cct gaa cgt aat aaa acg cat tgg gat tat cta 4533
Pro Lys Cys Val Glu Pro Glu Arg Asn Lys Thr His Trp Asp Tyr Leu
230 235 240

ctg gaa gag gtc aaa tgg atg gca gtt gat ttc cga acc gag acg aat 4581
Leu Glu Glu Val Lys Trp Met Ala Val Asp Phe Arg Thr Glu Thr Asn
245 250 255

acg aag cga aaa atc gcc aaa gtt ata gct cac gcc att gcg aaa cag 4629
Thr Lys Arg Lys Ile Ala Lys Val Ile Ala His Ala Ile Ala Lys Gln
260 265 270 275

cac cgc gac aag cag atc gag att gag aga gcc gcc gaa cgg gag atc 4677
His Arg Asp Lys Gln Ile Glu Ile Glu Arg Ala Ala Glu Arg Glu Ile
280 285 290

aag gag aag cga aaa atg tgt gca gga atc gcg aag atg gta cgg gat 4725
Lys Glu Lys Arg Lys Met Cys Ala Gly Ile Ala Lys Met Val Arg Asp
295 300 305

ttc tgg tcg tct acg gat aaa gtt gtg gat att cga gcg aag gaa gtt 4773
Phe Trp Ser Ser Thr Asp Lys Val Val Asp Ile Arg Ala Lys Glu Val
310 315 320

ctg gag tcg agg ctc agg aag gcg aga aat aag cat ttg atg ttt gta 4821
Leu Glu Ser Arg Leu Arg Lys Ala Arg Asn Lys His Leu Met Phe Val
325 330 335

att gga caa gtc gat gaa atg agc aat att gtg caa gaa gga ctt gtt 4869
Ile Gly Gln Val Asp Glu Met Ser Asn Ile Val Gln Glu Gly Leu Val
340 345 350 355

tca tcg tcg aaa tcc cca tca att gca tcg gat cga gat gat aaa gat 4917
Ser Ser Ser Lys Ser Pro Ser Ile Ala Ser Asp Arg Asp Asp Lys Asp
360 365 370

gaa gaa ttc aaa gca cct ggc tct gat tca gaa tct gac gat gag cag 4965
Glu Glu Phe Lys Ala Pro Gly Ser Asp Ser Glu Ser Asp Asp Glu Gln
375 380 385

aca att gca aac gcg gaa aag tca cag aaa aag gaa gat gtt cga cag 5013
Thr Ile Ala Asn Ala Glu Lys Ser Gln Lys Lys Glu Asp Val Arg Gln
390 395 400

gaa gtt gat gct ctt caa aac gag gca act gtg gat atg gat gac ttt 5061
Glu Val Asp Ala Leu Gln Asn Glu Ala Thr Val Asp Met Asp Asp Phe
405 410 415

ttg tac act tta ccg ccg gaa tat ctg aag gct tat ggt ctg acg cag 5109
Leu Tyr Thr Leu Pro Pro Glu Tyr Leu Lys Ala Tyr Gly Leu Thr Gln
420 425 430 435

gag gat ttg gag gag atg aag cgc gag aaa ttg gag gag cag aag gct 5157
Glu Asp Leu Glu Glu Met Lys Arg Glu Lys Leu Glu Glu Gln Lys Ala
440 445 450

cgg aag gaa gct tgt ggt gat aat gag gag aaa atg gag att gat gaa 5205
Arg Lys Glu Ala Cys Gly Asp Asn Glu Glu Lys Met Glu Ile Asp Glu
455 460 465

gttcgtagga tgctcctaaa aaaattacct aaaaaaaatc gattttccct ggaaaaaatc 5265
ctctggaaat gacccgaaac gtcatggcgg ctcgaaattt tgaaaaaaaa aaccccccaa 5325
atttccagct aaaatctcaa attttattgc atattttggt agttcttttg ttgtccgagg 5385
tgcgtttttc agctgaaaat gtacctgaat ctgcaagtaa acgaccaata tatgcaataa 5445
atgatgataa ttaatttccg atactgaaat gtgggcgaaa tttgagattt cgactgaaaa 5505
cgtcttaaaa atcacccaaa acccggcttt accgcacgaa ggtttgaaga aaatggccaa 5565
tttttagcca aaatctcaaa tttcgtccac ttttcagtca gaaattagtt ttttgaaatt 5625
aattaacacc ttttattgca tattttcgtc gtttattcgt tgatcgaggt gctttttcgg 5685
tcgatgggtg cacaaattcg gtaattgtgc atccatcggc tgaaaatgct ccagaatttg 5745
cgaatgaacg gtgaaaattt aagattttag attgaaataa gccgtttttt agagaaaatt 5805
ggtcgttttg agacattaaa ttcaatttaa atcccctctt tattttcag agc cca tca 5863
Ser Pro Ser
470

tca gat gct caa aag cct tcc acc tca agc tca gat ctc acc gcc gag 5911
Ser Asp Ala Gln Lys Pro Ser Thr Ser Ser Ser Asp Leu Thr Ala Glu
475 480 485

cag ctt caa gat cca aca gct gaa gac ggc aac ggt gat ggt cat ggt 5959
Gln Leu Gln Asp Pro Thr Ala Glu Asp Gly Asn Gly Asp Gly His Gly
490 495 500

gta ctt gaa aac gtg gat tac gtg aag ctc aac agt cag gat agt gat 6007
Val Leu Glu Asn Val Asp Tyr Val Lys Leu Asn Ser Gln Asp Ser Asp
505 510 515

gaa cga caa caa gag ttg gcg aat atc gca gaa gaa gcg ctg aaa ttc 6055
Glu Arg Gln Gln Glu Leu Ala Asn Ile Ala Glu Glu Ala Leu Lys Phe
520 525 530

cag cca aaa gga tat aca ctt gag acg aca caa gtc aag acg ccc gta 6103
Gln Pro Lys Gly Tyr Thr Leu Glu Thr Thr Gln Val Lys Thr Pro Val
535 540 545 550



cca ttc ctg att cga gga caa ctg aga gaa tat caa atg gtt gga ttg 6151
Pro Phe Leu Ile Arg Gly Gln Leu Arg Glu Tyr Gln Met Val Gly Leu
555 560 565

gat tgg atg gtt aca ctt tat gag aag aat ttg aat gga att ctt gcc 6199
Asp Trp Met Val Thr Leu Tyr Glu Lys Asn Leu Asn Gly Ile Leu Ala
570 575 580

gac gag atg ggc ctg gga aag acg att caa acg att tcc ctg ctg gct 6247
Asp Glu Met Gly Leu Gly Lys Thr Ile Gln Thr Ile Ser Leu Leu Ala
585 590 595

cat atg gct tgt agt gaa tcg att tgg gga cca cac ttg att gtt gtg 6295
His Met Ala Cys Ser Glu Ser Ile Trp Gly Pro His Leu Ile Val Val
600 605 610

ccg acg tct gtc att ctg aat tgg gag atg gag ttc aag aaa tgg tgt 6343
Pro Thr Ser Val Ile Leu Asn Trp Glu Met Glu Phe Lys Lys Trp Cys
615 620 625 630

ccg gct ctg aag att ttg acg tat ttt ggt acg gcg aag gag cgt gcc 6391
Pro Ala Leu Lys Ile Leu Thr Tyr Phe Gly Thr Ala Lys Glu Arg Ala
635 640 645

gag aag cgg aag gga tgg atg aag ccg aat tgt ttc cat gtg tgc atc 6439
Glu Lys Arg Lys Gly Trp Met Lys Pro Asn Cys Phe His Val Cys Ile
650 655 660

aca tca tac aag acg gtt act caa gat att aga gct ttt aag cag agg 6487
Thr Ser Tyr Lys Thr Val Thr Gln Asp Ile Arg Ala Phe Lys Gln Arg
665 670 675

gtgcgtagaa attttgaaga tttgcggcga atttggcgaa tttgcataat ttttttaaaa 6547
ccaattttac cgataattgc gaaatttttc aattttatac agtggtcgga aattgctata 6607
attagtataa tttttgcaaa aattggtact tttttcgaaa ttttgaacca ccataaaaca 6667
tttttgaaca atttttaaga ggtttaataa cgaaattcgt tcatttgaac acattttggc 6727
gatatgaatc gcccgaaaat gtcccccaat agacctaatt tcttaacaaa aatttaaaaa 6787
aaaatggccc aaaattgtct caaaatttcg aaaaaaaaac cgtaatttca gctgaaatct 6847
caaaatttgc caaattttcc gtctcacgga gatcagaaaa agttttttgc atttttttgt 6907
ggtttatttt agcgttattt cgttaattta gatacatttt agcccaattt ttgcaaaaat 6967
tatactaatt atagcaattt ctgacccctg acaaactttg aaattatcgg taaacttggt 7027
ataaatggtt tttttccaaa tttttaaagc gatattaaag gtggagtacc acaatttgag 7087
gctttgtttt tttttttgga cccaaattgg tccaaaacta ccgaatttcg taatgagacg 7147
ctctgaaaat ttctttctca aaaaaaaagt tacggcggtt caaagttcgc ggcaaaataa 7207
ggcccatttt cagctaaaat caaaattttt tcccaacttc tcggtgtctc aacgcctgga 7267
acctaatttt tatttattca tcacttttta ataaatattg tggtctttga ttgggctttt 7327
attcgttgat ttaagtacat ttatggtcag tggggcacaa aatgtaactt tttttcccaa 7387
agaccataaa tgtactttaa tcaacgaata aacgcccaat caaagaccac aatatttatt 7447
taaaagtaat gaataaataa taattaggtt ccagacgttg cgacaccgag aagttggaaa 7507
atttttttat tttagctgaa taagggcctt attgtctcaa actttgaacc gccataactt 7567
ttttttgaga acgtctcgtt acgaaattcg gtagttttgg accaatttgg gtctaaaaaa 7627
acaaagtctc aaatttcttg ttagagattt tttaaaaatt gatatttttt ttttcag gcc 7687
Ala

tgg cag tac cta att ctc gat gaa gct caa aat atc aaa aac tgg aag 7735
Trp Gln Tyr Leu Ile Leu Asp Glu Ala Gln Asn Ile Lys Asn Trp Lys
680 685 690 695

tcc caa cgt tgg cag gct ctt ctg aat gtc cgt gct cga cgt cgc ctt 7783
Ser Gln Arg Trp Gln Ala Leu Leu Asn Val Arg Ala Arg Arg Arg Leu
700 705 710

ctc ctg acc gga act cca ctt cag aac tct cta atg gaa ctg tgg tcg 7831
Leu Leu Thr Gly Thr Pro Leu Gln Asn Ser Leu Met Glu Leu Trp Ser
715 720 725

ttg atg cat ttt ttg atg cca aca ata ttc tca agt cat gat gat ttc 7879
Leu Met His Phe Leu Met Pro Thr Ile Phe Ser Ser His Asp Asp Phe
730 735 740

aag gat tgg ttc tcg aat ccg ttg aca ggg atg atg gaa gga aat atg 7927
Lys Asp Trp Phe Ser Asn Pro Leu Thr Gly Met Met Glu Gly Asn Met
745 750 755

gaa ttc aat gct cca cta atc gga cga ctt cac aaa gtg ctc cgt ccg 7975
Glu Phe Asn Ala Pro Leu Ile Gly Arg Leu His Lys Val Leu Arg Pro
760 765 770 775

ttt att ctg cgg cgg ctc aag aag gaa gtt gag aag cag ctg cca gag 8023
Phe Ile Leu Arg Arg Leu Lys Lys Glu Val Glu Lys Gln Leu Pro Glu
780 785 790

aag act gag cat att gtg aat tgt tcg ttg tca aag cgg cag aga tac 8071
Lys Thr Glu His Ile Val Asn Cys Ser Leu Ser Lys Arg Gln Arg Tyr
795 800 805

ctg tac gat gac ttt atg agt cgt aga tca aca aag gag aat cta aag 8119
Leu Tyr Asp Asp Phe Met Ser Arg Arg Ser Thr Lys Glu Asn Leu Lys
810 815 820

tct gga aat atg atg tcg gtg ctc aac att gtg atg caa ctc cga aaa 8167
Ser Gly Asn Met Met Ser Val Leu Asn Ile Val Met Gln Leu Arg Lys
825 830 835

tgt tgt aat cat ccg aat ctc ttc gag ccg cgg cca gtt gtt gct ccg 8215
Cys Cys Asn His Pro Asn Leu Phe Glu Pro Arg Pro Val Val Ala Pro
840 845 850 855

ttc gtc gtt gag aag ctt cag ctc gat gtt ccg gct cgt ctc ttt gaa 8263
Phe Val Val Glu Lys Leu Gln Leu Asp Val Pro Ala Arg Leu Phe Glu
860 865 870

att tcg cag caa gat ccc tcc tcc tcc tca gct agt caa att ccg gaa 8311
Ile Ser Gln Gln Asp Pro Ser Ser Ser Ser Ala Ser Gln Ile Pro Glu
875 880 885

att ttc aat tta tcc aaa atc ggc tat caa tct tcc gtt cga tct gca 8359
Ile Phe Asn Leu Ser Lys Ile Gly Tyr Gln Ser Ser Val Arg Ser Ala
890 895 900



aaa cca ctc atc gaa gag ctt gaa gca atg agc act tat ccg gag cca 8407
Lys Pro Leu Ile Glu Glu Leu Glu Ala Met Ser Thr Tyr Pro Glu Pro
905 910 915

cga gca cca gaa gtt ggc gga ttt cgg ttc aat cgg acg gct ttt gtt 8455
Arg Ala Pro Glu Val Gly Gly Phe Arg Phe Asn Arg Thr Ala Phe Val
920 925 930 935

gca aag aat ccg cat acg gaa gag tcg gag gac gaa ggt gtt atg aga 8503
Ala Lys Asn Pro His Thr Glu Glu Ser Glu Asp Glu Gly Val Met Arg
940 945 950

agt cgt gtt ctg gtgaattttt aggaaaattg agaaaatgat ctaattgttg 8555
Ser Arg Val Leu
955

aattttttaa agaatttatg ggccacaagc cgatttgccg gaaattttga tttttggcga 8615
tttgccgaaa attttgattt ttggcgattt gccagaaatt ttgatttttg gcaattatcc 8675
gatttgccgg aaattttgat ttttggcgat ttgccagaaa ttttgatttt tggcaattat 8735
ccgatttgcc ggaaattttg aattttggca attttccgat ttgccggaaa ttttgatttt 8795
tggcaatttg ccgaattgcc ggaaattttg atttttggca atttgccgaa ttgccggaaa 8855
ttttgatttt tggggatttg ccggaaattt tgatttttgg caatttgcct atttgtcgga 8915
aattttgatt tttggcaatt tgccgatttg tcggaaattt tgatttttgg caatttgccg 8975
atttgccgga aattttgatt tttggcaatt ttccgatttg ccaaaaattt tgatttttgg 9035
cgatttgccg atttgccgga aaaacatttt gtgagccaat tttctcgaaa tttgggcttc 9095
aatattttca aattattcca aattttccac tgattccgaa tatctaagta aaaaaaaatt 9155
ccctgatttt atatttcagc ttaaaatcgc taattttcgc gtcagagacg acgtcatgtg 9215
tcgatttact ggatttttaa tctttgtcgg atgctaattt ccgtttttca acgagtttcc 9275
ttcatttcca tcggtttttg acgaagtttt ctttgaaaat atgttcttaa ggtcaattaa 9335
acgttttatt atcaaaaaaa actagcaaaa ttggctttaa aaacacattt tcacagaaaa 9395
ctccgacaaa aaccgacgaa aatgaaggaa accccccgtt tgaaaacaga aattagcatc 9455
tgataaagat taaaatcccg taaatcgaca catggcgtct ggcgtctctg gcacgaaaag 9515
tcgcgatttt aagctgacat acaaaaaaag agggatatat ttttttacga atttttcaca 9575
tagatattcg aaatcagggg ggaaaatttg gagaaatttg agaaaatttc tcagatttcg 9635
gattaaaaat attcaatttt tgttttctta tattaaaaaa aaattaactt ttataatttt 9695
tcag cca aaa cca att aat gga aca gct caa cca ctt caa aat gga aat 9744
Pro Lys Pro Ile Asn Gly Thr Ala Gln Pro Leu Gln Asn Gly Asn
960 965 970

tca ata cca caa aat gct cca aat cgt cca caa act tca tgc att cgt 9792
Ser Ile Pro Gln Asn Ala Pro Asn Arg Pro Gln Thr Ser Cys Ile Arg
975 980 985

tca aaa acc gtc gta aat aca gtt cca ctg acc atc tcc acc gat cga 9840
Ser Lys Thr Val Val Asn Thr Val Pro Leu Thr Ile Ser Thr Asp Arg
990 995 1000

agt ggt ttt cat ttt aat atg gcc aat gtt gga aga ggt gtt gtt cgt 9888
Ser Gly Phe His Phe Asn Met Ala Asn Val Gly Arg Gly Val Val Arg
1005 1010 1015

ttg gat gat tca gca cgt atg agc cca ccg ctc aaa cgt cag aag ctc 9936
Leu Asp Asp Ser Ala Arg Met Ser Pro Pro Leu Lys Arg Gln Lys Leu
1020 1025 1030

acc gga act gca acg aat tgg agt gat tat gtt ccg cga cac gtt gtt 9984
Thr Gly Thr Ala Thr Asn Trp Ser Asp Tyr Val Pro Arg His Val Val
1035 1040 1045 1050

gaa aag atg gaa gaa tcg aga aaa aac cag ctg gaa att gtt cga agg 10032
Glu Lys Met Glu Glu Ser Arg Lys Asn Gln Leu Glu Ile Val Arg Arg
1055 1060 1065

cga ttt gag atg att cgt gct ccg att att cca ctg gaa atg gtt gcg 10080
Arg Phe Glu Met Ile Arg Ala Pro Ile Ile Pro Leu Glu Met Val Ala
1070 1075 1080

ctg gtt cga gag gaa att att gca gaa ttt cca cgt ttg gct gtg gaa 10128
Leu Val Arg Glu Glu Ile Ile Ala Glu Phe Pro Arg Leu Ala Val Glu
1085 1090 1095

gag gac gag gtt gtg cag gag agg ctt ttg gag tat tgc gag ttg ttg 10176
Glu Asp Glu Val Val Gln Glu Arg Leu Leu Glu Tyr Cys Glu Leu Leu
1100 1105 1110

gtg caa aggtagaatt ttgaaaatta ttactttgct tttttttaaa ccaaaattgg 10232
Val Gln
1115

cccaaaacta ccgaatttcg taatgagaca ttctgaaagc ttctcaaaaa aaaagttttg 10292
gccgctcaaa gttcgggaaa ataaggccca ttttcagctg aaatcaaaat tttttccaac 10352
ttctcggtgt cgcaacgtct ggaactaaaa ttttggaaaa cgagaaattt tccatttttt 10412
gcaagctgaa aaatcaaagt ttttttttcc tcaaaattgg acaaacaaaa aaattttttt 10472
ttgaaaattg atcgaaaaaa ttcaaaattt ctataatttt tcgatttttt aaataaaact 10532
ttcatcattt ttcttccaaa tttagttttc tcgattttaa cttttttcaa aaaaaaattt 10592
tttaatacga aaaaaattca attttagctc taattctttt ttagacccaa attggtccaa 10652
aactaccgaa tttcgtaatg agacgttctg aacatttctc aaaaaaaagt tatgacggtt 10712
caaagttcgg caaaataagg cccattttca tataaaatca aatttttttt ctaacttctc 10772
ggtgtcacaa cgtctggaac ttaattttta tttaattatt acttttcaat aaatattgtg 10832
gtcttttatt aggcgtttat ttgttgattt aagtacattt atggtcaagt ggggcccaaa 10892
taaaagttac attttgtgcc cacatgacca taaatgtact taaatcaacg aataaacgcc 10952
taatcaaagg ccacaatatt tattaaaaag tgttgaataa ataaaaatta ggttccagac 11012
attgtgacac cgagaagtta aaaaaaattt tgattttagc tgaaaatggg ccttattttg 11072
ctgaacttta aaccgctata actttttttt gagaaatttt cagaacgtct cattacgaaa 11132
ttcggtagtt ttggaccaat ttgggtctaa aaaagaatta gagctaaaat tgaattttct 11192
tcgtattaaa aatttttttt ttgaaaaaag taaaaatcga gaaaactaaa tttggaagaa 11252
aaatgatgaa aattttattt aaaaaatcga aaaattatag aaattttgat cgattttttc 11312
gatcaatttt caataaaaaa ttttttgttt gtccaatttt gaggaaaaaa aaaactttga 11372
tttttcagct tacaaaaaat ggaaagtttc tcgttttcca attttttgat gtggattttt 11432
atgagaaaaa atatataatg tcacaaaaaa tagattatta tctaaaaatc gaaaaaatta 11492
aattttccag ttttcaggaa aaaaatcgtt aagaaattgt ttttccatta aaggtggagt 11552
accgaatttt gagacgctgc ttttttagac ccaaaatggt ccaaaactac cgaatttcgt 11612
aatgatacgc tctgaaaaat tttcaaaaaa aaagttgtga ccgctcaaag ttttggaaaa 11672
atggcatatt tttagctaaa atctcaaatt ttggcaactt atcggtgtcg cagcggttgg 11732
aacttaattt ttatttaatt gtcattcatt aatgcatgtt ttggcatttc attatgtgtt 11792
atttcgttga ttgagatgct ttttgtgcct gcatcgacca aaaaaccatc tcaatcaacg 11852
aaataacaca taataaaatg ccaaaatatg cattaaagga tgataatcaa ataaaaatta 11912
agtttcaacc gctgcgacac cgctaagttg ccaaaatttg agattttagc taaaaatggt 11972
ccatttttct aaaactttga gcggtcacaa cttttttttt gagaaatttt cagagcgtct 12032
cattacgaaa attggtaggt tcggaccaat ttgggtctaa aaaagcagcg tctcaaaatt 12092
cggtacttca cctttaaagt tttcaattta aagtataaat tatccaatca aaaattgacg 12152
aaaaaatttt ttaaaaattt tttcttccga aaaaaaaatt aattttaatt tttgtt aga 12211
Arg

ttc gga atg tac gtc gaa cca gtg ctg acc gat gct tgg cag tgt cgt 12259
Phe Gly Met Tyr Val Glu Pro Val Leu Thr Asp Ala Trp Gln Cys Arg
1120 1125 1130

cca tca tcg tct ggt ctt cca tca tat att cgc aac aat tta tca aat 12307
Pro Ser Ser Ser Gly Leu Pro Ser Tyr Ile Arg Asn Asn Leu Ser Asn
1135 1140 1145

atc gag ctg aat tct cgt tct ctt ctc ctc aac acc tcc act aat ttc 12355
Ile Glu Leu Asn Ser Arg Ser Leu Leu Leu Asn Thr Ser Thr Asn Phe
1150 1155 1160 1165

gat acc cga atg tcg atc tca cgt gct ctt caa ttc cca gaa ctc cgt 12403
Asp Thr Arg Met Ser Ile Ser Arg Ala Leu Gln Phe Pro Glu Leu Arg
1170 1175 1180

ctg atc gag tac gat tgt gga aag ctt cag acg ttg gct gtt ctg ctt 12451
Leu Ile Glu Tyr Asp Cys Gly Lys Leu Gln Thr Leu Ala Val Leu Leu
1185 1190 1195

cgt cag ttg tac ctg tac aag cac aga tgt ctg atc ttc acg caa atg 12499
Arg Gln Leu Tyr Leu Tyr Lys His Arg Cys Leu Ile Phe Thr Gln Met
1200 1205 1210

tca aag atg ctc gac gtt ctg cag acc ttc ctt tct cat cac ggt tat 12547
Ser Lys Met Leu Asp Val Leu Gln Thr Phe Leu Ser His His Gly Tyr
1215 1220 1225

cag tat ttc cgc ctc gac ggt acc act ggt gtc gaa caa aga cag gcg 12595
Gln Tyr Phe Arg Leu Asp Gly Thr Thr Gly Val Glu Gln Arg Gln Ala
1230 1235 1240 1245

atg atg gag cgg ttc aac gcg gat ccc aag gtg ttt tgc ttc att ctg 12643
Met Met Glu Arg Phe Asn Ala Asp Pro Lys Val Phe Cys Phe Ile Leu
1250 1255 1260

tcg acg aga tcc ggt ggt gtt gga gtc aat cta acc ggt gct gac act 12691
Ser Thr Arg Ser Gly Gly Val Gly Val Asn Leu Thr Gly Ala Asp Thr
1265 1270 1275

gtg atc ttc tac gat tcg gat tgg aat ccg acg atg gat gct cag gct 12739
Val Ile Phe Tyr Asp Ser Asp Trp Asn Pro Thr Met Asp Ala Gln Ala
1280 1285 1290

cag gat aga tgt cat cgt atc gga cag acg agg aat gtc tcg att tat 12787
Gln Asp Arg Cys His Arg Ile Gly Gln Thr Arg Asn Val Ser Ile Tyr
1295 1300 1305

cga ttg att tcc gag cga aca att gag gag aat att ctg aga aag gca 12835
Arg Leu Ile Ser Glu Arg Thr Ile Glu Glu Asn Ile Leu Arg Lys Ala
1310 1315 1320 1325

aca cag~a1a'g"i5gg" cga ~i:ttr g£ji ftf^ttt^fca "fef gSc gacf^ct: €tc 12883 
Thr Gln Lys Arg Arg Leu Gly Glu Leu Ala Ile Asp Glu Ala Gly Phe
1330 1335 1340

aca ccc gag ttc ttc aaa caa tct gac agt att cgg gat ctt ttt aat 12931
Thr Pro Glu Phe Phe Lys Gln Ser Asp Ser Ile Arg Asp Leu Phe Asp
1345 1350 1355

gga gag aat gtg gaa gtg act gct gtg gca gat gtt gcg acg acg atg 12979
Gly Glu Asn Val Glu Val Thr Ala Val Ala Asp Val Ala Thr Thr Met
1360 1365 1370

agc gag aaa gaa atg gag gtt gcg atg gca aag tgt gaa gat gaa gct 13027
Ser Glu Lys Glu Met Glu Val Ala Met Ala Lys Cys Glu Asp Glu Ala
1375 1380 1385

gat gtg aat gcg gcg aag att gcg gtg gcc gag gcg aac gtt gat aat 13075
Asp Val Asn Ala Ala Lys Ile Ala Val Ala Glu Ala Asn Val Asp Asn
1390 1395 1400 1405

gcg gag ttt gat gag aaa tca ttg ccg ccg atg agc aat ttg caa gga 13123
Ala Glu Phe Asp Glu Lys Ser Leu Pro Pro Met Ser Asn Leu Gln Gly
1410 1415 1420

gat gag gag gct gat gag aag tat atg gag ttg ata caa c aggtaaaatt 13173
Asp Glu Glu Ala Asp Glu Lys Tyr Met Glu Leu Ile Gln
1425 1430

cggcggaaat cggaaatttt cccatttaga atatcaaatt ttgcccgatt gtgtcgtttt 13233
ttgatttttc gatttattcg atttgttttt gagggaaaat cggaaaaatg ttcagaaaat 13293
taaccataac atgtgatctt tttaaaatct tagcgcaaat gtcttctaaa aaataaagaa 13353
tgaccaaaaa ttttaagcta atttttgaaa aaccaaagaa aaaatttaga tttttcgatg 13413
ttttccgaga caaaaagaca aaaacggaaa ttgtcgaaaa tgaatgaaat ttttaatttt 13473
tcagcaaaaa aaaaatagta cttaatttta aaaaatgtga tcatttcggt aggaaaatct 13533
ggaaaaatcg attttcaaac aaaaaaaaac cgagcctcta caatcttttt ttttcccgaa 13593
atctccagaa cttctcacaa taacaactat ataaatttca aaatttc ag ctc aaa 13648
Gln Leu Lys
1435

cca atc gaa cga tat gcc att aac ttt ctt gag aca cag tac aag cca 13696
Pro Ile Glu Arg Tyr Ala Ile Asn Phe Leu Glu Thr Gln Tyr Lys Pro
1440 1445 1450

gaa ttt gag gaa gaa tgc aaa gag gca g aggtatatta ttccattcat 13744
Glu Phe Glu Glu Glu Cys Lys Glu Ala
1455 1460

ctgacttttt tttttttttt ttaaatttaa atttcaccaa attaattac ag gct ctt 13801
Glu Ala Leu
1465

atc gac caa aaa cgc gaa gaa tgg gac aaa aat ctc aac gat acc gcc 13849
Ile Asp Gln Lys Arg Glu Glu Trp Asp Lys Asn Leu Asn Asp Thr Ala
1470 1475 1480

gtc att gac ctc gac gat tcg gat agt ctg ctg ctc aac gat cct tcg 13897
Val Ile Asp Leu Asp Asp Ser Asp Ser Leu Leu Leu Asn Asp Pro Ser
1485 1490 1495

act tct gcc gat ttt tat cag agc tca agt ctt tta gac g aggtacgcga 13947
Thr Ser Ala Asp Phe Tyr Gln Ser Ser Ser Leu Leu Asp
1500 1505 1510

tcgtcgtcgt cgcagcagca gccttctcca aaaagccgct caaaaaccgg caaaaaagcc 14007
tcaaaacttc caaattcgtg ctcgctcccc gtctaagcgt aaatctcagg ctccttcctt 14067 
cgatccatat gtttcgtacg caccgcacgc gctcgcttct cccccggatt ccccgcgtaa 14127 
gagaagatca cgtggtgcgc gtagtttagg tagtggtggt ggtggtggtg gtggtagtag 14187 
atctgttgga agacctgccc gccgatcagt gaagaaagaa gaatcagatg atgatgatga 14247
ggattattgc caagaagagg aagtgaagcg aaatccggca gaaaaggtcc cgccgaaaag 14307
aaaacgagtt gtgtttgtgg aacctccaga ggtgaagccg ccggagccga aaaaacgagt 14367 
tgttgttcct gctccatcat catcatcatc agctctaact actcttccac aacaaggacc 14427 
gctgatttcg ttgccaaaag ctgtgccagt tgtacctcgg ccccaacaac aagcaccacc 14487 
acagctcatc aaaaagcacc agcagactct gatgcctgtg aaggtgctca agattagtgg 14547 
tggtggtggt ggtactccag gaccatccag tgtatcgcca ggtccatcaa tcctccgaag 14607 
aaccgttgtt ccaggcatag gcgctggtgg tgttggacgc ctaccgcttg tcagaatgcc 14667 
tgttcgccct ccatttcctg gctcgcaagc tcctgctcca ccgctgagaa gtggtgttgc 14727 
tccaacagct cctgcagcag ctccacgcca gttcgtcgtt ccgtcgtcga gagttcgagt 14787 
tatcacgacg agaactccgg tcgccaccac catggtgcaa caacaacaaa gcccgagccc 14847 
gttgatgttt ccagtccggg ttgtgcaaag gcccgggcca tctggaccac caccacctgg 14907 
acctccagat cgcccaggat ttggaatcta tgagaagccg agattctcac ttggatcacg 14967 
aagaagccgt ggagattcgg gcccggaaga tccggcgcca ccacagccac caccacccac 15027 
cajct.tctagg^ccaccgccac aagcctaggc gctaggattt tccttttttt tttgttgatt 15087 
tttgctcttt ttttgctctc tcatgatttt ataatctcat tttgctttaa tatttccatt 15147
tttttggatg tgtggaattt ttttttttga aaatcgggaa aaaacgaaaa atttgaactt 15207 
tttggtgatt ttcagagaaa aatccgtttt taaatgaaaa aatcggaata attcagattt 15267 
ttcgaaaaaa aaaaccgaga aaatttcaaa ttttcagttt tttttttcaa aaaatcgaaa 15327 
aaaaaagtaa attttcagaa ttatcagcca agtttttgcg attttttgaa aaatttcaat 15387 
ttttggcaat ttttgggaaa aaatcaattt ttaattcaga aaattggaaa aattaagatt 15447 
tttcgaaaaa aaaaacgaag aaagtttcaa atttttagct tttttcaaaa aatcgaaaat 15507 
cggaattttt ttaatttttc gaataaaaaa aatcgaagaa attccaaaac tttgcgtttt 15567
ttcttgaaat tatctgaaaa ccggaatttt ttttcaaaat tcgccatttt ttgcgaattt 15627
ttgtaatctt tttccgagaa aactcgattt tttaaatctt aataattcag atttttcgat 15687
tttcttttgt tccaaaaagt caaaaaccga acaattattt atttcaaaaa ctctaaaaat 15747 
tttcaatttt ttggaaattt tcgggtataa aaaaaaccca tttttaaatc aaaaaatcgg 15807 
aaatttttgt gatttttcga tttttttcac tccaaaaaaa ttccacacag caaaaaataa 15867 
actccgcgca tttttgagcg cacctttcaa tgttttaatt cttatcacga cgtcaaaatt 15927 
cggttatttt tcacacacac acattttcct cccgagcggt tctttttttc atgagttctc 15987 
ccatgttttg tttttatatt tgagacattt ttttttgttg ataagtttca acttcttctt 16047 
cttcttctga ctataaacgt ttttctccat gttttttgcc tgttttctgc cgattttttg 16107 
acacccaaaa ttttttttca ttttcgctcg aaaatgcacg tcgttggctc tagctttggc 16167 
aagtttttaa cactgatttt ctggtttttt tttttttttg cagaattttt cagagatagg 16227 
gggctcattc cagcagggtt tcccactata tttcgcattt tttccaaaaa tttttgtatt 16287 
ttcaaaaatt tccaaaaaga aaggggtttt ctttaccaaa tttttctcgc cacttttggc 16347 
ttaattttgg ctttagagat tcgatcgaaa aaattgcgaa agtggcgaga aatctcactg 16407 
gtttgatgtt tgacccccta ctatagaaaa tttgaaaaaa aaaaaaaaaa aaaaaaacta 16467 
gacgaaattt gtggaaatct tgctggagtt tgacgagtcg atggtggatt tttcttgaaa 16527 
cgaatgaaac ggtgattttg gatcggagaa atatggcgaa aaatggtgag aaatgacgag 16587 
gaggaggaag aagctgaaaa tctggaggaa caaaaattgt gtggaagtct cgggaagaaa 16647 
ttagaattga aattttaaag tgttctgaga attttttgtg tgaaattttt ttaaatctgt 16707 
agatcaaata tcaaaaaaaa aaatcagaac tattacgtgt ttatccacaa agatgagaaa 16767 
aatcgccata tctggcgcgc aaatgaaccc gcgggaagag acaaaactac tgtagttttt 16827
aaccaatttg tgtagattta cgagctattg cgtcatcgaa ttgaatttaa ttttcaggcg 16887 
tttcacacgt ttttatattg aaatttatct atttattgaa tcaatcttaa aagaaaacac 16947 
aaaaaatttt ttttaaaaat t^cg^ctcaa aattaaattc aattcgatga cgcaatagct 17007
cgtaaatcta cacaaattgg ttaaaaacta cagtagtttt gtctcttccc gcgggttcat 17067
ttgcgcgcca gatatggtga tttttctcat ctctggataa acacgtaata acatttctcg 17127 
gcacaataaa tttttgctga aacaagtgcg cgcctttgaa gagtactgca atttcaaaca 17187 
cggttttttg gttggaaagc acagtacttt ttcaaaggtg cacaccttct cgaatttctc 17247 
ttcgtgtcga gaccaagaat gccatttttc gatttttaaa aaatcaaaaa aaaaattacc 17307
tttttaaagg tggagtaccg aaatttgaga ctttgttttt ttcggcccaa aatggtccaa 17367
aactaccgaa tttcgtaatg agacgttctg aaaatttctc aaaaaacaac gttatggcgg 17427
tttaaagttc agcaaaataa ggcccatttt cagctaaaat caaaattttt tcccagcttc 17487
tcggtgtcac aacgcctgga acctaatttt tatttattca tcactttttg ataaatattg 17547
tggtctttta ttaggcgttt attttattga tttaagctta tttatggtct ttgtggcgtt 17607
acattttgta ccctaaaaac cataaatgta cttaaatcaa cgaataaacg cctaatcaaa 17667
ggctacaata tttagtagaa agtgataaat aaataaaaat taggttccag acgttgcgac 17727
accgagaagt tggcgaaaac tttgatttta gctaaaaata agccattttc ccaaaacttt 17787
gagcggtcat aacttttttt tgagaaagaa attttcagaa tgtctcatta cgaaattcgg 17847
tagctttggg ccattttggg ccgaaaaagc aaagtctcaa atttcagcac tccaacttta 17907
gcctttacct tggtgaaatt ttttaatctg tagtatactt tatttttggc cgactttttg 17967
aacacaaatt cggtgttagt ttaaaaaaac aatcaaaact aacatattat ccagacgcga 18027 
aatttttgtc ggttttcttc gcgccaaaaa gtacggtaac aggtttcggc acgatacatt 18087 
tttgttaaaa ggtgctgctc ctttgaagag tgtctaataa ttttcaactt tcgtttctgt 18147 
tggaattttc ttcaattttt catagatgtt ttcgatgaaa caaaaaatta acacaaaatc 18207 
gtcgtgtcga gacccgaaaa aattttgcgt ctgtgcaaca aacccggaaa attaaagtag 18267 
catattgatc caaattgccg atttgccgga aattttgatt ttcggcaata taccgatttg 18327 
ccggaacatt tgattttctg gaatataccg atttgccgga atttttggtt ttcggaaatt 18387 
tgccggaaat ttagaattcc ggcaatatgc cgatttgccg gaaattttga ttttcggcaa 18447 
tatgccgatt tgccggaaat tttgattttc ggcaatatac cgatttgccg gaacatttga 18507 
tttccggcaa tatgccgatt tgccggaatt tttgatttcc ggcaatatgc cgatttgccg 18567
gaaattttga ttttcggcaa tataccgatt tgccggaaca tttggtttcc ggcaatatgc 18627
cgatttgccg gaatttttgg ttttcggaaa tttgccggaa atttagaatt ccggcaattt 18687
gccgatttgc cggaaatttt gatttccggc aatatgccga tttgccggaa attttggttt 18747
tcggaaattt gccggaaatt tagaattccg gcaatatgcc gatttgccgg aaattttgat 18807
ttccggcaat atgccgattt gtcagaagaa atcgtttgtc acccacacgt gtattgattt 18867
gatttttct ag ata aaa ttc tac gac gag ctg gac gat atc atg cca atc 18917
Glu Ile Lys Phe Tyr Asp Glu Leu Asp Asp Ile Met Pro Ile
1515 1520



tgg ctt cca cca tca cca cca gat tcg gat gcg gat ttc gac ttg aga 18965
Trp Leu Pro Pro Ser Pro Pro Asp Ser Asp Ala Asp Phe Asp Leu Arg
1525 1530 1535 1540

atg gaa gat gat tgt ctc gat ctg atg tat gaa att gaa caa atg aac 19013
Met Glu Asp Asp Cys Leu Asp Leu Met Tyr Glu Ile Glu Gln Met Asn
1545 1550 1555

gag gct cgc cta cca caa gtt tgt cat gaa atg aga cgt ccg ttg gct 19061
Glu Ala Arg Leu Pro Gln Val Cys His Glu Met Arg Arg Pro Leu Ala
1560 1565 1570

gaa aaa cag cag aaa cag aac acg ttg aat gcg ttt aa tggtaatatt 19109
Glu Lys Gln Gln Lys Gln Asn Thr Leu Asn Ala Phe Lys
1575 1580 1585

ttcaaaaaaa aatttttttg aaaaaattca attaaattcg attttgagca atttttatcg 19169
tgaagattgc ataattttga gattttgcgc caagattttt gttaaattga aaaaaagaga 19229
tgtgcgcctt tatggagtac tgtagttttg aaaattgaaa ttacagtact ctgtttaaag 19289
gcgcacacat gtattacgta gcgaaaagaa aagtacagta attagttaaa taagactact 19349
gtagcgcttg tgtcgattta cgggctctga attttatatg aatttttgaa aactagaaac 19409
atctcaaatt gcataaaatt accatttgaa cctcccgcca agtgattttg ttcgacgggg 19469
cgcgcttgca cgttttctat tttaatttaa ttcaattttt trtgcttaat tctcaccgat 19529
ttttcatgtt ttcagtttga ttttgatgga aatttggaga caatatcaac ataaatgctt 19589
ttcaatcgaa aatgtgcatt tatattgaca ttttctccga atttccatca aaattaaact 19649 
gaaaacacga aaaatcggtg agaattaagc gaaaaaattg agttaaatga aaatagaaaa 19709 
cgtgcaagcg cgctccatcg aacaaaatca attggcggga ggttcaaatg ggaattgtat 19769
gcaattttca aaaggtcgta taaaattttg aagaaagcaa attaaattta aaaaatcgag 19829
ctcgtaaatc gacacaggcg ctaattttca aaaaaataaa atgacaccca aaaaatcata 19889 
agaaaatcat aaataaatat tacgggaaca caaaactcag agaacccgta ttgcacaaca 19949 
tatttgacgc gcaaaatatg aaatatctcg tagcgaaaag aaaactaccg taatttaaaa 20009 
acatttaaat gactactgta gcgcttgtgt cgatttacga gatctcgatt ttctaaataa 20069 
attttttaaa aaatgatgtc agcgatattc catttgactt tgtttcttcg tattattttc 20129
tcatttttgc ttgattttat ttaattttat aattttattt aaaatcaagc aaaaacgaga 20189 
aaataatacg aagaaacgga gttaaatgga atatcgctga cataatttaa aaaaaaaatt 20249 
taattagaaa atcgagatcc cgtaaatcga cacaagtagt catagtacag tagtcattta 20309 
actaattact gtacttttct tttcgctgcg agatatttca tatttttatt catattttta 20369 
tttattttca tatttttata tatatatata tatatatatt tcttggcgtt ctaatgcagt 20429 
ttctctcaat taattcc a gac att cta tcg gca aaa gaa aag gaa tcg gtg 20480
Asp Ile Leu Ser Ala Lys Glu Lys Glu Ser Val
1590 1595

tac gat gcg gtc aac aag tgc ctt caa atg cca caa tcc gaa gcg atc 20528
Tyr Asp Ala Val Asn Lys Cys Leu Gln Met Pro Gln Ser Glu Ala Ile
1600 1605 1610

aca gca gaa tct gca gcg tct cca gca tac acg gaa cac tca tca ttc 20576
Thr Ala Glu Ser Ala Ala Ser Pro Ala Tyr Thr Glu His Ser Ser Phe
1615 1620 1625

tcg atg gat gat aca agc cag gat gcg aag att gag cca agt ttg act 20624
Ser Met Asp Asp Thr Ser Gln Asp Ala Lys Ile Glu Pro Ser Leu Thr
1630 1635 1640

gaa aat caa caa ccc acc acc acc gcc act act act act aca gta ccc 20672
Glu Asn Gln Gln Pro Thr Thr Thr Ala Thr Thr Thr Thr Thr Val Pro
1645 1650 1655 1660

caa caa caa caa caa cag cag cag caa aaa tcg tcg aaa aag aag aga 20720
Gln Gln Gln Gln Gln Gln Gln Gln Gln Lys Ser Ser Lys Lys Lys Arg
1665 1670 1675

aat gat aat cga a cggtacggag gttactagcg aacaatttca agaaattttg 20773
Asn Asp Asn Arg
1680



aatttgtgaa aattcaattc cggcaatttt tcgatttgcc ggaactttta attttcgccg 20833 
aattgtcaat ttgccggaaa ttttgatttc cgccgaattg tcgatttgcc ggaacttttc 20893 
attttcggca aattttcgat ttgccggaac ttttaatttt tgacaaattg tcgatgtgcc 20953 
ggaaattttg attttcgaca atttgctgat ttgccggaaa tttcaatccc aacaattttc 21013 
cgatttgccg gaaatttcaa tcccaacaat tttccgattt gccggaaatt tcaatcccaa 21073 
caattttccg atttgccgga aatttcaatc ccaacaattt tccgatttgc cggaaatttc 21133 
aatcccagca attttccgat ttgccggaaa tttcaattcc ggcaattttt cgatttgccg 21193 
gaacttttca ttttcggcaa agtgtcgatt tgccggaact tttcattttc gccgaattgt 21253 
cgatttgccc gaacttttaa tttttgacaa attgtcgttt tgctggaaat tttgattttc 21313 
gacaatttgc caatttgccg gaacttttaa tttttgacaa attgtcgatt tgccggaaat 21373 
tttgattttc gacaatttgc caatttgccg gaacttttca tttttgccaa attgtcgatt 21433 
tgccggaaat tttaattccg gcaattttgc gatttgccgg aaatttcaat tccggcaatt 21493 
taaaaacact aaaaaccaaa aattttcggt tttcccgttt ttcgatgttt cagcttttct 21553 
^caa^aaattg_.cgattccccg--aBaaatcgaa aeaattttcg -gggtHaaaac egggaaattr 21613 
ctaaattcct atttaaaaga attgaaaaaa aactctcaaa attcc ag gct caa aat 21669 
Lys Ala Gln Asn 

cga aca gct gaa aat ggt gtg aaa cga gcg aca act cca cca cca tca 21717 
Arg Thr Ala Glu Asn Gly Val Lys Arg Ala Thr Thr Pro Pro Pro Ser 
1685 1690 1695 1700 

tgg cgt gaa gag cca gat tat gat gga gcc gaa tgg aat ata gtt gaa 21765 
Trp Arg Glu Glu Pro Asp Tyr Asp Gly Ala Glu Trp Asn Ile Val Glu 
1705 1710 1715 



gat tat gca cta ctt caa gca gtt caa gtc gaa ttt gca aat gct cat 21813 
Asp Tyr Ala Leu Leu Gln Ala Val Gln Val Glu Phe Ala Asn Ala His 
1720 1725 1730 

tta gtc gaa aaa tcg gcg aat gag gga atg gtg ttg aac tgg gaa ttc 21861 
Leu Val Glu Lys Ser Ala Asn Glu Gly Met Val Leu Asn Trp Glu Phe 
1735 1740 1745 

gtg tcg aat gcc gtt aat aag cag aca aga ttt ttc cgc tcg gcc cgt 21909 
Val Ser Asn Ala Val Asn Lys Gln Thr Arg Phe Phe Arg Ser Ala Arg 
1750 1755 1760 

caa tgc tca att cga tat caa atg ttt gtt cgg cca aaa gag ctc gga 21957 
Gln Cys Ser Ile Arg Tyr Gln Met Phe Val Arg Pro Lys Glu Leu Gly 
1765 1770 1775 1780 



cag ttg gtg gct tct gat ccg att tcc aag aaa acg atg aaa gtc gac 22005 
Gln Leu Val Ala Ser Asp Pro Ile Ser Lys Lys Thr Met Lys Val Asp 
1785 1790 1795 

cta tcg cat act gaa tta tct cat ttg aga aaa gga cga atg act acg 22053 
Leu Ser His Thr Glu Leu Ser His Leu Arg Lys Gly Arg Met Thr Thr 
1800 1805 1810 

gag agc caa tat gct cat gat tat gga ata ttg act gat aag aaa cat 22101 
Glu Ser Gln Tyr Ala His Asp Tyr Gly Ile Leu Thr Asp Lys Lys His 
1815 1820 1825 

gtg aat aga ttt aaa agt gtt cga gtg gcg gca aca cgg aga cct gtt 22149 
Val Asn Arg Phe Lys Ser Val Arg Val Ala Ala Thr Arg Arg Pro Val 
1830 1835 1840 

cag ttt tgg aga ggc cct aaa ggt aga gga gga tgg ctt cat aat agt 22197 
Gln Phe Trp Arg Gly Pro Lys Gly Arg Gly Gly Trp Leu His Asn Ser 
1845 1850 1855 1860 

cac tgc aac ttt ttc ctc acg agg gac gag aaa aag tgg ttt cta ggc 22245 
His Cys Asn Phe Phe Leu Thr Arg Asp Glu Lys Lys Trp Phe Leu Gly 
1865 1870 1875 



cat ggc cga ggt gcc gac aag ttt ca gcggccattt atcttgcttt 22291 
His Gly Arg Gly Ala Asp Lys Phe 
1880 

gttttccgcc cgttttcttt cgtttftcac cgattttCfce cgtfttttcr taataaaaciT 22351 
gataaataaa tattttttgc agatgctaaa aaaatttcca agtaaaaaaa tcatgtattc 22411 
agtgggcatg cagcggtgaa agtgggcatt gtaatatgat ggattacggg tatacaaaac 22471 
ctaaactttt tctgaaacat gatacatgtg ctgcttaaat gctgagacta cctgattttc 22531 
ataacgagac cgctgaaaaa gttttgaggt ttccaaaatt caactttttt gataaaaaag 22591 
tcgagatttt cgcacaaaaa gttgaatttt gaaaacctca aaactttttc agcggtctcg 22651 
ttatgaaaat caggtaattt cagcatctaa gcatcatatg tatcatgttt cagaaaagtt 22711 
taggttttgt attcccgtaa tccatctatt tacattgacc actttcaccg ctgcttgccc 22771 
actgaataca taattttttc acttggaaat tgttttagca tctggaaaaa gtatttattt 22831 
atcagtttta ataagaaaaa acgggaaaaa gctgtgaaaa acaaaagaaa acaggcggaa 22891 
aacaaagcaa gataaatggc cgctgaaact tgtcggcccc tcggccatgg cctagaaacc 22951 
acttttcttc gtccctcgtg aggaaaaagt tgcagtgata gtctaaaatt cggaggaatt 23011 
ttttaaaatt ggaaaaaatt gttaaatttt ttttttctgg aaattggaaa atcacaaatt 23071 
ttcgattttt gtttgttaaa aaaaaaaaga aaattggcat aataaaacat ttcttttttt 23131 
tttgaaaatt gggaacttct taatatcaga ttttttaagt aagatttttt tgattttccg 23191 
gaaattcgga aaacctgaaa attttcaaca tttcaaaata aaaatttccg tttttttttt 23251 
ctgaaaatct ccaacaaaaa aaggtcaaat cgtcagaatt attgttggaa gtggcggttt 23311 
ttcacgatta gagttcagta ttttttcttc tgaatttcaa atttgaaaaa aaatcgaata 23371 
aactgtagaa aaatgataga aaattaacaa aaattctgat taaaggtaaa gggaaaatag 23431 
accgtaatga ccgaatataa ctgttgaaaa tatcaacaaa aaaaattctg aattttttgt 23491 
gactttttca atttttcaag aataaaaaaa acgaccgaat aaaatatttg aattcccgcg 23551 
caaatgagtg actggttctg gccaatttac agtcttttta taaaagaaaa aatctagaaa 23611 
aaccggcgaa tttagccaga aaacgcaaaa aattaaaaat gacgtcactc atttgcgcgc 23671 
ggaatacaaa tttaattagg ccgtttcttt gatttttgaa aaattgaaaa aaccattaaa 23731 
aaatttagaa atttttttga attttttaca gttttttatt cggtcattat ggggttattc 23791 
aagtagtgtc ggaaaattaa aaagtgtaga aaaattacgt cacaactctg tattcaagta 23851 
tataaaaaca tgtatttaaa tacattttgc tacattactt gaataacccc attagggttt 23911 
attttcttta gagcaaaaaa aaacatgttt ggctctactc cacctttaaa tgaaaaaatc 23971 
gacaatttgt gattttgcaa tttccagaaa aaaaagaaaa aagttgcttt ttggaaaaaa 24031 
ccaaaaaaag ccatttgaaa aattttattt tccaaaaaaa attattttgc agctctagaa 24091 
tctcgaaatc tgcaatctct aaacggcgga atgccaccac gacacgagtc gagactcgcc 24151 
gaattcgacg taaaaaccaa tattcgcctg gacgccgagg acattgtcac aatgtccgac 24211 
gagtcgattg tcgcctatga agcgagcaag aagaagctac tggccagtcg tcaaacaaaa 24271 
ccctcaccac gtcaagatgt ccgattccat acgctggttc ttcggccgta taccgtacct 24331 
gtgacaactg agtactcggc tgcaccttct cgtcgtgaaa tgcgcatcgc tgttccaccg 24391 
cttcagcctt cggctttatc tacgatttcc tcagttgctg ctgctgccac gtctgggcca 24451 
ctaccatcaa ttcagcattt gcagtcgtcg tctacgggct tgggatctca gcaaaatttg 24511 
caaaattcgc ataattctga gcaaagaaat aatgtgcaaa atatgcatca aaatcaatat 24571 
aattcaagtc aaaatccgcc aatacctatc cgacaaatcg gagcagcatc atcacaccaa 24631 
catgatcaag gatctcaggg gcctggggga aaaccacaag cctatcacct ggtgcaacag 24691 
ggatcacagc aacagcagca gcagcagcag caggcgacgt tacagcgaag aaatgcggcg 24751 
gcggcggcag ggtcgaatgt gcagtttatt cagcagcagc agcagcagca gcaatcgggt 24811 
aaaaattgta tggatttata ggaaattata tgaatttgcg cggggatagc cccggcgaaa 24871 
aacgggaaaa agcgacaatt taaaaaaaaa tcgtgtgaaa atctcaattt tttacaattt 24931 
tgaaagtaat tttttattga aaaaagtgga atttaggcat tcatccagag cagggctggg 24991 
accaaaaaaa atttttggac caaaaaccaa aaaacaaaaa attgaaattt ccgaaaaatc 25051 
aacttaagca tcaaaaattt tttgtttttt tttttgtttt ttggtttttt ttggtatttt 25111 
gacgaaaaaa cgattttttg gttttttggt ttttcgagac caaaaaaacc aaaaaatcca 25171 
aaaaaatgtt tgccgtgtct agtctcgacc tagacacggc aaacattttt tttttttgga 25231 
ttttttggtt tttttggtcc cgaaaaacca aaaaaaccaa aaaatcgatt tttcgtcaaa 25291 
ataccaaaaa aaaacaaaga attcccagcc cctttcgcca aaattgccgg atattttcaa 25351 
acctcaaaaa aaatttataa aggtggacta catcctgtgg ggaaattgct ttaaaacatg 25411 
cctatgggct cacaatgacc gaatatcatg attaaaaaat tcaacaaaaa aattactaga 25471 
ttttatgtga ttttttgaaa attaaaaaaa tctcagtttt caacctaatt cctatttgaa 25531 
tttccgccaa tttgatttgt tcgatggagc gcgcttgctt tatttttttt tattcattga 25591 
ttttattttt attagcatta tttcactgat tttcttcatt ttttgtgtgt ttttggtggg 25651 
- - - aattga^a tg^aaaaaaaai:a_ag.a.taa J aJtg<r_ a.qaaacitctg. ttaaaaggtc attgaaaatg 25711 
cttaaaacgg caacaagctt gaaatttgta tattttacac agttttacgc attttcaatg 25771 
actttttaac aaactttccg catttatctt gtttttttca gttcaatttc cattaaaaaa 25831 
cacacaaaaa aaaatgaaga aaatcagtga aataaggtta ataaataaaa taaatgaata 25891 
aaaatgatgc aagcgcgctc caacgaacga attcaattgg cggaaattca aatatggaat 25951 
taggtgaaaa ctgagatttt tttttcaatt ttcaaaaaat catataaaat ctagaaccat 26011 
tttttgaatt ttttaatcat gatattcggt cattgtgacc ccataggcgt gttttaaagc 26071 
aatttcccca cagggtgtag tccacctttg acgaggtttg aaaatgtccg gcaattttgc 26131 
cgaaattgcc ggaaacttga gatttttcag tgaaaaattc caaatttcat gtggaaaact 26191 
gtttttttgt tttttggaaa atgcaacaaa aaaaactatt tggcgcgaaa acgcggatag 26251 
ttttgccaat tttcaaggat tttccgctat ttttaatgtt tttatgccga attttacttt 26311 
aaaaaatcat aattattcgg aaaatgctcg aagagcattt ccaattgtct gtggagcgcg 26371 
tttgactaat cagataatat tccaggcggt caaggacaaa gcttcgttgt catgggctcg 26431 
cagagctcat caaatgatgg acaaggtgga gcatcgaccg tcggaggagg aggaggagga 26491 
tcacaacagc ctcaccagca gcagcagcag cagccacaac aaagaataca gtacattcca 26551 
caagttaccg gtagcggaaa taacggtgga ggtggtggaa gaggaggcta cggtagtaca 26611 
ctggtcatgc caagaggagg acgtgttgtc agggttggta gaaatacaaa atcgcgaaaa 26671 
aacggcattt ccggcttccc gaccaatcag cgatttgctc cgcccacttt cggaccaatc 26731 
cgctgaccga ggcatttgat tggtttgaaa ttgggcggag cagcgaattg ctgatgcgaa 26791 
atacgggaag ttctcatttt gatggaaatt ctgcaaaatt ctttaaaaaa aacaaaatct 26851 
tctcaaattc ggaaaaaatc acaaaggaaa tcgaagaaaa tcgcgatttt tgattccccg 26911 
accaatcagc gatttgctcc gcccactttt gaaccaatca gcgttcgagg catttgattg 26971 
gttcaaaact gggcggagca gcgagttgct gattggattt ttcagttttt aaatttttaa 27031 
agcttttttt aacggaaaaa ttcgagaaaa ccatagattt tgatgagaaa tgatgaaaat 27091 
tttcatgaaa aaatggaaaa atgattggaa attaatcaaa aaatcttgaa aaaaaatttt 27151 
ttttcagaga aaatgcttca tttttggctc tgaaacgcct cttttttatt tgtgcctccc 27211 
cgaccaatca gcaatttgct ccgcccactt ttgaaccaat cagcgaccga gcgatccgat 27271 
tggtttgaaa ttgggcggag ctaaaatgat tttaaaaaaa ttcccgattt gtttaatcta 27331 
gaaatttaga aaaaagaaat atagaaaaaa aatagaaaaa aattaaaaaa aaaaaaacaa 27391 
aaaatcggaa aacgtcggaa aatattacga aaaaaatttt tttaattgat tttttttcga 27451 
aaaaaactaa aattttaacc aaaaattcaa agaaaaaatt tgtttttgat ttttttttcg 27511 
aaaaaaaaaa aaattttaac caaaaattca aaaaaaaaat gtttttcttg atttttttcc 27571 
aaaaaaacta aaattttgac caaaaattca gcaaaaaaaa aattttttaa ttgatttttt 27631 
tttcgaaaaa aaataaaatt ttaaccaaaa attcaaaaaa aaaatttttt attgactttt 27691 
ttcgaaaaaa actcaaattt taaccaaaaa ttcaaaaaaa aaaatttttt ttttgatttt 27751 
ttccgaaaaa aactaaaatt ttaaccaaaa attcaaaaaa aaaatgtttt tcttgatttt 27811 
tttccaaaaa aactaaaatt ttgaccaaaa attcagcaaa aaaaaaattt tttaattgat 27871 
ttttttttcg aaaaaaaata aaattttaac caaaaattca aaaaaaaaat tttttattga 27931 
cttttttcga aaaaaactca aattttaacc aaaaattcaa aaaaaaaaat tttttttttg 27991 
attttttccg aaaaaaacta aaattttaac caaaaattca aaaaaaaatt ttttattgat 28051 
ttttttccaa aaaaactaaa attttgacca aaaattcagc aaaaaaaaaa ttttttaatt 28111 
gatttttttt cgaaaaaaac taaaattttg accaaaaatt caacaaaaaa aaaatttttt 28171 
attgattttt ttcgaaaaaa actaaaattt tgaccaaaaa ttcaacaaaa aaaaattttt 28231 
ccagccagcg ggaactctac caggcggtgg acgtctgtat gtcgatcata accgtcatcc 28291 
atatccaatg tcgtcaaatg ttgtgccagt acgtgttcta ccagccacgc aacaaggaca 28351 
acaacgaatg atgacaggac aacgtcgtcc ggctccagcg cccggtactg tcgccgcaat 28411 
ggtgttgccg aatcgaggag ctggtggaat tccgcaaatg cgcagtttgc agtgagtttt 28471 
gcacggaaat tggacgattt tcagcgaaat tttcgggaaa aatggctatt ttgtgtttga 28531 
aattgcgaaa tttcacgatt tcgtcttaaa tacggtgcca acctacccca tgacggtttg 28591 
atctacaaaa aacgcgggaa tttttcacac aaaaatatgt gagacgtctg cacgttctta 28651 
accaatcggt tgaaaactct gccgcatttt tgtagatcta cggtagatca ctgcagattt 28711 
taagagagaa aaataaataa ataatcccac aaggttttta aaattttttt ttcaatcgta 28771 
aaaaatagcg aaaaattgtt tttcgcgtcg agaccctacg cacatttttt tgcaattttc 28831 
gcttcaaaat tacggtaccg ggtctcgaca cgacattttt attgtgtaaa atacacaatt 28891 
ttttggaatt ttcatcgatt cgaatttaaa tatttttaaa tgatttaatt aattcttaac 28951 
gaaaaaaaaa aagttcgaaa ctgcagtact ctttaaaggc gcacacatgt atgtatttat 29011 
aaaaaatgtc gtgtcaagac cgtacttttg gctcacaaat tgcaaaatat tgcggaattt 29071 
tttttaattt tagataaaaa aaaacatgaa aaatctatgg aaactaaact tataatttaa 29131 
aa a aaa a t tt ~t 1 1 € aaggt* g- gact a cgc"tc ag t ggggaaa"*Ff gcYEEaaa* aca^^ccTaE~"2 9191 " 
gaggccccaa tgactgaata tcatgattaa aacaatcaaa aaaaattttc tagattttat 29251 
atgatttttt gaaaattgga aaaatcacag ttttcaccta attctttttg aatttccgcc 29311 
aattggatta gttcggtgga gcgcgcttac attattttta attatttatt ttatttattc 29371 
tcgttatttg actgattttc ttcatttttt gtgtgttttc ctcggaaaaa ggaagaaata 29431 
aacaagacaa atgcaaaatg tttgttaaaa agtaattgaa aatgcgtaaa actttgatat 29491 
tctgagttcc gacgacaaca agcctgaaat tagtatattt cacagttttt ctcattttca 29551 
attacttttt aacaaacatt ttgcatttgt cttgtgtatt tcttccattt tccgaggaaa 29611 
aaacatagaa aatgaagaaa atcggtcaaa taacgagaat aaataaaatt aattttaaaa 29671 
aagatgcaag tgcgctccac cgaacaaatc caattggcgg aaattcaaat atggaattag 29731 
gggaaaactg tgatttttcc cattttcaaa aaatcatata aaatttggaa aatttttttg 29791 
aattttttta atcatgatat tcggtcattg gcgccccata ggcgtgtttt aaagcaattt 29851 
ccccactgag cgtagtccac atttaatttt ccaaaacagc acatgctaat cctccaagtt 29911 
attccagacg aggcagttac accggcggtg gtggtcagca acgaatcaac gtgatggttc 29971 
aaccacaaca aatgcgcagc aacaatggcg gtggagtcgg tggccaagga ggcctccagg 30031 
gtggtccagg aggtccgcaa ggaattcgtc ggccactcgt cggacggcca ctacaacgag 30091 
gagtcgataa tcaggcgccg acggttgctc aggtcgttgt tgctccgccg caaggaatgc 30151 
agcaggcatc acaaggacca cccgtacttc atatgcagag agcggtttcc atgcaaatgc 30211 
cgacgagtca tcatcatcaa ggccaacagc aggctcctcc gcagagctca cagcaggctt 30271 
cgcaacaggc tcccacatcg gattctggga cgagtgctcc gccacgacaa gcaccaccac 30331 
cacaaaacta gaattttccc ctattatcct attttacccc ccaaaactct attaattaaa 30391 
taatttcctt cctatttttt tcttcgtgtg aagattattt gtcccccaac caagggtgtc 30451 
ggtttttcga tttttcgacg tttttcaaaa aaatttcgat ttttcgaaaa attagcttca 30511 
tattttggct attactctgc tttttagaag aaatttgtat gttttttctt gaaaatataa 30571 
gcaaaattag atttaaaaaa aatcatattt tatggttaat tttctgaaca tatttttcaa 30631 
ttttcgattt tcacagaaaa acatcgaaga atcgacaaaa tcgaaaaata tgttccgaaa 30691 
attaaccata aaatatgatt ttttttaaaa tctaattgtg attatattta taagaaaaaa 30751 
catacaaatt tcttctaaaa agcagagtaa tagccaaaat atgaagctaa tttttgaaaa 30811 
aacgaaaaat tttcgatttt ccaaagaatc gaaaaatcga aaaatgacac ccttgccccc 30871 
aactatctct gtatattatt catctattat tgattgtttc tttttgttcc tcgaaatttt 30931 
ttgaaattaa agttctcttc cccaccccga tttccgttgc tttattaatc gcgattgatt 30991 
aattgttttt ccataaatcc ccaactattt atctctgtat attattcatt tatattattt 31051 
atcttttatc tgtgtcgatt tacggtatct ccgggccgta tgattttgaa ttctcttctc 31111 
aaataaaatt gtttttcatc taacatttga tacgtgtttt tctgattttt ttgtatatat 31171 
attttccatg tatatatttc tttttctttt ttctttgctc caactttatt ttaaataatg 31231 
cttttttatc aagagatttt ttaaaaaatc gatttttttt aaagccagga attctgaaga 31291 
atcgaaaaaa atggaactat ttttcaaata atgagaaagt tttttttttt caagaaaaaa 31351 
ataataaaat tctgattttt ttaataaaaa tttaataagt ttttgaagat tttcattgaa 31411 
aacatctaaa ctattcgatt tttgatttta aattttgaaa atagaatttt ttaatatatt 31471 
tttttcaaat cgttaaaaag agaatgccgg aattttttaa aaattcttta aatttagaaa 31531 
taatcggaaa attttcgatt ataaaacgct gtataaaacg aaaaaaagtg gattttgatg 31591 
aaagaaaaaa ttttcttgta gtttttttca gaaaaaaatt actttttatt ctccattttt 31651 
tgttgttgaa tttttgagaa aaaactcatt ttgaaaaaat cgaatttttt atattttttc 31711 
taatcgtaaa aaaaatttta aaaatgaatt ccggtaattt tttaaaaaat aatattaatc 31771 
tatagttttg tagttaaaaa aatgtttcac ataaaaatct aaaaattttt gattttaaat 31831 
taaaaaaaaa tcgaattttt taaaattttt ttcaaatcgt aaaaaaagaa acaataaaca 31891 
aaagaattcc ggaaaaaaat tatattatga ttataaattt atagttcttt acttttttaa 31951 
aagattttat tttaaaaatt ctaaaatgat cgatttttgg ttttttaaaa taatcaaaaa 32011 
tgtttgattt tttttaaacg tgaaaaaaat gcaaagaaaa tgaaatccgg caaaaattgt 32071 
aatataatta taaatctata cttttgtggt tttttccaat atttctataa attcttgatt 32131 
tttaaaataa tcaaaagttt tgattttaaa ttttggaaaa attgaatttt tgtatatttt 32191 
tctaatcgta aaaaaatttt taaaaaaatc gaaagcggat ttttttctgc tattttgttt 32251 
tttttttgaa aaccggaaaa aataccaaaa attgatagtt tcgaccactc tggctagact 32311 
accaaaattg aatttttttt ttcgaattga gaatggccgt ggtctcatca gtagctagcc 32371 
attctctttt tatttcaatt tttaagaaaa aagtctctaa aattttgaaa aaatcgattt 32431 
tttttactta ctttgatact ttttttatat cttttcaaat cttaaaaaac aattttaaaa 32491 
attgaattcc ggaaattttt-r-tt.aaataata taaatctata gttttttagt ttttaaaaaa 32551 
tatattttta- tBaaaBtcta-aaa-agtrtcgg cttttgacttr fctgaaata^t' cg^aaatgtt" 32611 
tgttttaaat tttgaaaaaa tataaaaaat tcgatttttt caagataaaa aagcgaattt 32671 
tttgaatttt tttcaaatcg taaaaaatgt ctgtagtttt tttaaagact ctcataaaaa 32731 
tctgaaatgt tcgatttttt atttttaaaa taattttaaa aaaattttaa tattttttat 32791 
cgtgcgaatt ttttaccaac tataatttgg aataattttc aggatctcaa aatatcccac 32851 
aatcgcgcaa atatgccagg aagcaatgaa gattggataa agaaggaggt cgaggaccag 32911 
gacaccaacg ccaacagctc gagctccagc atagccgtct cgcgtcagct cgaagggaat 32971 
tctgctgttc ctgacgccat cgaccttctg tcttctcaaa tcaaaagaga agttgaagag 33031 
gaggatgatc gcaacgatga gactggaccc cgttcggagc ccgtggatgt taagccgtct 33091 
ccaaaacgcc caacgaagag gtcagccgag acctggacga cggctcggcg ccaagcaaga 33151 
aacggtctac ggcgggagac ggttcaactc atcgattcgc gtatgtgaat gttggagtcc 33211 
gccatccata cgatccacgc catcttgtca tggaaacttc attgaatgaa attaggtaag 33271 
gaattattga aaataattat tatatatcca ttttaattca attttttttt tcagaatcga 33331 
agatttcgaa ataatccagt atcttccgat gcccttcagg acttcgattc ccatgaagct 33391 
agtgatcttc gcagtgagaa gtgaagaatc tgccgagaag atccgctcgt taatcgatcc 33451 
ttcgatctgg atcgcggctt ttggtggcgg aaccgaaact caaaaattct tgtggagcga 33511 
gctgacggtg gaggatttcg tcaaggcaca cataatggcc agcaggtaag ctttcgaaca 33571 
tacttaattt tttaaaaact aaaattcagc gcaaccgatg acgtgccata tgaggcagcc 33631 
atggcggatc gagaatcgct caaacaagct gtaaatgatg ccagctctct gaaaggcttg 33691 
aaggaggtaa taatttagaa atgacagaaa atgaaccgtg atgacgaaat acatctgtaa 33751 
aaaaattata aaaaattcta agctccgttt ttaatttttt ttttcagtta tattctgtca 33811 
tagcggccta tttctctgga aaaaaaaatc caaaatagcc tcaaattcgg aattatgctt 33871 
cgattttttt tctgcggtag tcctgaattt aagacgattt tgaatttttg tagctgcctt 33931 
tcgccacaat tacgttaaac atttcagagc atgtcgaaag ctggatggag gatcgtgagt 33991 
aagatgcgga aagatctcaa tggagcctga tgatcccctt cccagcacac aagacagttt 34051 
taattttgtg tctgtatagt tttatattaa gttttgatga taatgaattt ttttacggtt 34111 
ttatccatca cttggctcga ttgaagctcc tattgtgcag cacacacggc gtgtaaatta 34171 
gtgcatctaa cctaggaaat gcgatttcta ggccatggcc gaggatccga ctagatcttt 34231 
tttgatggtg tttgtacaga gttaaatttc attttggagg gaaattgaag gaaattgaaa 34291 
gagaaattaa tttaataata ttaatttgat ttaaatgacc agaacaaaac aaataaactg 34351 
aatgacaagc caatcgatat tcgtccagac tgggatgatg ttatatgaac tctttcacct 34411 
gaaacattta agttttttta ataaaagagc aagcgcgctc aaacgcgaaa acgctcgatc 34471 
cacttaatct ggattttgtg ccgattcatt tatttcaagc tatgctcgtt tttttctgtt 34531 
atgtttcatt aaaaagaccg aaaacataac aaaaagtgcc tgaaaacgaa aaaaaaccgg 34591 
cgacattaat tgaaaaattc aaaactacaa tttcgccgcc aaaacccaac gagacccaaa 34651 
gtttcagcgc ggagcgtttc cacttggccg tggagcgcgc ttgtatataa aaggacttaa 34711 
ttttttaaaa tacttaccgc agttacttcc aatgtatgtc aaattcactc gattctccat 34771 
tgcagggtta ctaaaatatg ctccaaatag ttcgcaaggc gttgacttga ataaatcggg 34831 
atggttatct tggatgattg cagttcgatt tccttttgta attatgttct aaaaagtcat 34891 
tgtaatcatt taaaagtgga gtagcgccag tggggatttt gtctaaatgc acttattatg 34951 
atccaaaaca accgaatatc atcataaaac actccaaaaa gtttagtttt ttcataattt 35011 
cctgtcaaag ttttggcaaa ttggcaaaat tttgaaaaat gcgagctttt gaggtaattt 35071 
aaggaaatgt cgcatgtttc gacccctaca attatttaat acagataatt taaacaaaat 35131 
taaaacataa aaatgtagaa attttttttg ttttggtcga tttccaaaat tatgagtggc 35191 
aaaaactgag taattgccac tttttgacag taaataaaaa atgttcaaaa ttttttgaaa 35251 
cgttttatca tgatatttgg ccattatggg agcaaatgag tggtttatct attttttcac 35311 
tggcgctact ccacctttaa gcatgtctgc ctcaccataa tcccatttaa tccaacgttt 35371 
cttagatttg gattcgaaca tatttgaatg actggaaaat atgttacgtt accattcaat 35431 
gcaccaatat aagtcatttg atcgagaaaa ttcaaatcgg tgagatttgt gtttctgata 35491 
gtcaatgttc cgaataaaaa ttgtaacact cctaatttgg aaacatattt ttcatcttca 35551 
tggtctatta atagatctcc aaggatatac atacatgtat ctgatagttt gctcattgat 35611 
tcaaatgtgc aataaaatga cgcatccaat ggaccaggat ctttgcaaag tttcgcttca 35671 
atgttttcag tagaaattcc aaggttcaat agggcaacta tctcagtaat ggtgacacaa 35731 
aaatcaggat gaaggttttc aaaattgaag tattgccttt tattgtatgt actgtattgt 35791 
atcatactgg tttgctcaac tgtatctata actttctgaa attttatgtc attattttca 35851 
gaaatcgcac taggcaggca agcctgcctt accgtcagaa ttggcagtcc cagtcgaatc 35911 
atttccggat tatcttgtac attcaatgct acactagcta tatccgagtt atattcgata 35971 
gtttgcaggt tttgtaaaaa cgacaaactc tgtagattag tgttccgaat tgcaatagat 36031 
cctcgaatca ttgtgacatt caaaaatgaa tcataatcga aggttgcatt aatattcact 36091 
aaatttagac cagaatctag agttttgcat ttggagtact ccttaacatt tgatacatta 36151 
actttttcac catcacatcc tgaaatttga ctatttttat actgttaaaa aattgtttct 36211 
caccacaatc ctttaagttc cctctgacaa tgagctcatt atacatgtgt aaaaagccgc 36271 
catcacagga aaattccagt ttcggattat tctcgattct aatatcacac gcctcgatac 36331 
cccgatcacg gtacaagtag agatcgtaga gcacactggg gtcgtttaat tgtgaattgt 36391 
ttcggatgta aacaccgtct gaaatctgaa gtttaagaaa aaattaa^ta agttttaatc 36451 
tacatgttga tccgtttttg ttgaaagtat caaaaaatta actggagtca gaatgtctca 36511 
tttcgttttg atcttcaaaa aatgcgggag ttcagaccta gacatctcgt ctgatttcgc 36571 
atggttaaga gcgttctgac gtcacaattt ttctgaaaaa atattcccgc attttttgta 36631 
gatcaaatta aaatgagaca gcctgacacc acgtggagtt ccttatatac aaaaaagttg 36691 
atttttcgct cgtgattttt cgttgtaaca tcatgaaaaa tccagtgttc tctgcaaacc 36751 
actaaaatcc acttttttgt ttcagccgct ccgcaagcag cttcgtcgag gtcatggcag 36811 
cggccgagtt tcccactccg ctgaaactcg gcacttaata tatgaacgac taagctagca 36871 
gggccgccat tctaccttac cagcaaaaat gaattcgttc acttacacac atcacacacc 36931 
acattaaagt ttcctttttc tttgtcagct gtaaaaaccg aaaggcttgt cagactagta 36991 
ttctcaatat taaatc 37007 
ssl-1 predicted exons: 

Exon    Position in genomic sequence (inclusive) 
1       1001-1281 
2       1923-2027 
3       2084-2312 
4       4420-5205 
5       5855-6487 
6       7685-8515 
7       9700-10184 
8       12211-13165 
9       13643-13726 
10      13796-13939 
11      18879-19101 
12      20449-20735 
13      21661-22273 
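
The exon coordinates above can be checked against the cDNA length with a short calculation. The following Python sketch is illustrative only and is not part of the original listing; the names `SSL1_EXONS`, `exon_lengths`, and `splice` are hypothetical. It assumes the coordinates are 1-based and inclusive, as the table header states, so each exon has length end - start + 1.

```python
# Exon coordinates copied from the ssl-1 predicted exon table
# (1-based, inclusive positions in the genomic sequence).
SSL1_EXONS = [
    (1001, 1281), (1923, 2027), (2084, 2312), (4420, 5205),
    (5855, 6487), (7685, 8515), (9700, 10184), (12211, 13165),
    (13643, 13726), (13796, 13939), (18879, 19101), (20449, 20735),
    (21661, 22273),
]


def exon_lengths(exons):
    """Return the length of each exon; inclusive ends give end - start + 1."""
    return [end - start + 1 for start, end in exons]


def splice(genomic, exons):
    """Join the exon slices of a genomic string (hypothetical helper).

    Positions are 1-based and inclusive, so position p maps to index p - 1.
    """
    return "".join(genomic[start - 1:end] for start, end in exons)


lengths = exon_lengths(SSL1_EXONS)
total = sum(lengths)
print(lengths)
print(total)
```

Under these assumptions the thirteen exon lengths sum to 5656, which agrees with the final position number of the ssl-1 cDNA in Figure 20B.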
Figure 20B 

ssl-1 cDNA 

atgccggcaa caccggtgcg tgcttcaagt actcgaataa gcagacgtac atcatcaaga 60 
tcagtggctg atgatcagcc atcaacttcg tctgcggtgg ctccacctcc ttcacccatt 120 
gccatagaaa ctgatgaaga tgcggtagtt gaggaggaga aaaagaagaa aaagacatca 180 
gatgatttgg aaattatcac tccaagaact ccagtcgatc ggcgaattcc ctacatttgc 240 
tcgattcttt tgactgaaaa tcgatcgatt cgcgataaat tggttctgag cagcggtcca 300 
gttcgtcaag aagatcacga agaacagatt gctcgagctc aacggataca gccagttgtc 360 
gatcaaattc aacgagtcga gcaaatcata ctcaatggtt cagtggaaga tattctgaaa 420 
gatcctcgat tcgcagtaat ggcagatctc acaaaagaac caccaccaac acctgcacct 480 
cctcctccaa tccagaagac aatgcaaccg attgaggtga aaattgagga ttcagagggc 540 
tcaaatacgg ctcaaccgag tgttctgccc agttgtggag gaggagagac gaatgtggaa 600 
agagccgcca aaagagaagc gcatgtattg gctcgaatcg ccgagctccg taagaacggc 660 
ttatggtcga acagtcgtct gccaaagtgc gtcgaacctg aacgtaataa aacgcattgg 720 
gattatctac tggaagaggt caaatggatg gcagttgatt tccgaaccga gacgaatacg 780 
aagcgaaaaa tcgccaaagt tatagctcac gccattgcga aacagcaccg cgacaagcag 840 
atcgagattg agagagccgc cgaacgggag atcaaggaga agcgaaaaat gtgtgcagga 900 
atcgcgaaga tggtacggga tttctggtcg tctacggata aagttgtgga tattcgagcg 960 
aaggaagttc tggagtcgag gctcaggaag gcgagaaata agcatttgat gtttgtaatt 1020 
ggacaagtcg atgaaatgag caatattgtg caagaaggac ttgtttcatc gtcgaaatcc 1080 
ccatcaattg catcggatcg agatgataaa gatgaagaat tcaaagcacc tggctctgat 1140 
tcagaatctg acgatgagca gacaattgca aacgcggaaa agtcacagaa aaaggaagat 1200 
gttcgacagg aagttgatgc tcttcaaaac gaggcaactg tggatatgga tgactttttg 1260 
tacactttac cgccggaata tctgaaggct tatggtctga cgcaggagga tttggaggag 1320 
atgaagcgcg agaaattgga ggagcagaag gctcggaagg aagcttgtgg tgataatgag 1380 
gagaaaatgg agattgatga aagcccatca tcagatgctc aaaagccttc cacctcaagc 1440 
tcagatctca ccgccgagca gcttcaagat ccaacagctg aagacggcaa cggtgatggt 1500 
catggtgtac ttgaaaacgt ggattacgtg aagctcaaca gtcaggatag tgatgaacga 1560 
caacaagagt tggcgaatat cgcagaagaa gcgctgaaat tccagccaaa aggatataca 1620 
cttgagacga cacaagtcaa gacgcccgta ccattcctga ttcgaggaca actgagagaa 1680 
tatcaaatgg ttggattgga ttggatggtt acactttatg agaagaattt gaatggaatt 1740 
cttgccgacg agatgggcct gggaaagacg attcaaacga tttccctgct ggctcatatg 1800 
gcttgtagtg aatcgatttg gggaccacac ttgattgttg tgccgacgtc tgtcattctg 1860 
aattgggaga tggagttcaa gaaatggtgt ccggctctga agattttgac gtattttggt 1920 
acggcgaagg agcgtgccga gaagcggaag ggatggatga agccgaattg tttccatgtg 1980 
tgcatcacat catacaagac ggttactcaa gatattagag cttttaagca gagggcctgg 2040 
cagtacctaa ttctcgatga agctcaaaat atcaaaaact ggaagtccca acgttggcag 2100 
gctcttctga atgtccgtgc tcgacgtcgc cttctcctga ccggaactcc acttcagaac 2160 
tctctaatgg aactgtggtc gttgatgcat tttttgatgc caacaatatt ctcaagtcat 2220 
gatgatttca aggattggtt ctcgaatccg ttgacaggga tgatggaagg aaatatggaa 2280 
ttcaatgctc cactaatcgg acgacttcac aaagtgctcc gtccgtttat tctgcggcgg 2340 
ctcaagaagg aagttgagaa gcagctgcca gagaagactg agcatattgt gaattgttcg 2400 
ttgtcaaagc ggcagagata cctgtacgat gactttatga gtcgtagatc aacaaaggag 2460 
aatctaaagt ctggaaatat gatgtcggtg ctcaacattg tgatgcaact ccgaaaatgt 2520 
tgtaatcatc cgaatctctt cgagccgcgg ccagttgttg ctccgttcgt cgttgagaag 2580 
cttcagctcg atgttccggc tcgtctcttt gaaatttcgc agcaagatcc ctcctcctcc 2640 
tcagctagtc aaattccgga aattttcaat ttatccaaaa tcggctatca atcttccgtt 2700 
cgatctgcaa aaccactcat cgaagagctt gaagcaatga gcacttatcc ggagccacga 2760 
gcaccagaag ttggcggatt tcggttcaat cggacggctt ttgttgcaaa gaatccgcat 2820 
acggaagagt cggaggacga aggtgttatg agaagtcgtg ttctgccaaa accaattaat 2880 
ggaacagctc aaccacttca aas tggaaat^tca^ta.ccac^aaatgctcc aaatcgtcca 2940 
caaacttcat gcattcgttc aaaaaccgtc gtaaatacag ttccactgac catctccacc 3000 
gatcgaagtg gttttcattt taatatggcc aatgttggaa gaggtgttgt tcgtttggat 3060 
gattcagcac gtatgagccc accgctcaaa cgtcagaagc tcaccggaac tgcaacgaat 3120 
tggagtgatt atgttccgcg acacgttgtt gaaaagatgg aagaatcgag aaaaaaccag 3180 
ctggaaattg ttcgaaggcg atttgagatg attcgtgctc cgattattcc actggaaatg 3240 
gttgcgctgg ttcgagagga aattattgca gaatttccac gtttggctgt ggaagaggac 3300 
gaggttgtgc aggagaggct tttggagtat tgcgagttgt tggtgcaaag attcggaatg 3360 
tacgtcgaac cagtgctgac cgatgcttgg cagtgtcgtc catcatcgtc tggtcttcca 3420 
tcatatattc gcaacaattt atcaaatatc gagctgaatt ctcgttctct tctcctcaac 3480 
acctccacta atttcgatac ccgaatgtcg atctcacgtg ctcttcaatt cccagaactc 3540 
cgtctgatcg agtacgattg tggaaagctt cagacgttgg ctgttctgct tcgtcagttg 3600 
tacctgtaca agcacagatg tctgatcttc acgcaaatgt caaagatgct cgacgttctg 3660 
cagaccttcc tttctcatca cggttatcag tatttccgcc tcgacggtac cactggtgtc 3720 
gaacaaagac aggcgatgat ggagcggttc aacgcggatc ccaaggtgtt ttgcttcatt 3780 
ctgtcgacga gatccggtgg tgttggagtc aatctaaccg gtgctgacac tgtgatcttc 3840 
tacgattcgg attggaatcc gacgatggat gctcaggctc aggatagatg tcatcgtatc 3900 
ggacagacga ggaatgtctc gatttatcga ttgatttccg agcgaacaat tgaggagaat 3960 
attctgagaa aggcaacaca gaagcggcga cttggagagt tggcaattga cgaggctggc 4020 
ttcacacccg agttcttcaa acaatctgac agtattcggg atctttttga tggagagaat 4080 
gtggaagtga ctgctgtggc agatgttgcg acgacgatga gcgagaaaga aatggaggtt 4140 
gcgatggcaa agtgtgaaga tgaagctgat gtgaatgcgg cgaagattgc ggtggccgag 4200 
gcgaacgttg ataatgcgga gtttgatgag aaatcattgc cgccgatgag caatttgcaa 4260 
ggagatgagg aggctgatga gaagtatatg gagttgatac aacagctcaa accaatcgaa 4320 
cgatatgcca ttaactttct tgagacacag tacaagccag aatttgagga agaatgcaaa 4380 
gaggcagagg ctcttatcga ccaaaaacgc gaagaatggg acaaaaatct caacgatacc 4440 
gccgtcattg acctcgacga ttcggatagt ctgctgctca acgatccttc gacttctgcc 4500 
gatttttatc agagctcaag tcttttagac gagataaaat tctacgacga gctggacgat 4560 
atcatgccaa tctggcttcc accatcacca ccagattcgg atgcggattt cgacttgaga 4620 
atggaagatg attgtctcga tctgatgtat gaaattgaac aaatgaacga ggctcgccta 4680 
ccacaagttt gtcatgaaat gagacgtccg ttggctgaaa aacagcagaa acagaacacg 4740 
ttgaatgcgt ttaatgacat tctatcggca aaagaaaagg aatcggtgta cgatgcggtc 4800 
aacaagtgcc ttcaaatgcc acaatccgaa gcgatcacag cagaatctgc agcgtctcca 4860 
gcatacacgg aacactcatc attctcgatg gatgatacaa gccaggatgc gaagattgag 4920 
ccaagtttga ctgaaaatca acaacccacc accaccgcca ctactactac tacagtaccc 4980 
caacaacaac aacaacagca gcagcaaaaa tcgtcgaaaa agaagagaaa tgataatcga 5040 
acggctcaaa atcgaacagc tgaaaatggt gtgaaacgag cgacaactcc accaccatca 5100 
tggcgtgaag agccagatta tgatggagcc gaatggaata tagttgaaga ttatgcacta 5160 
cttcaagcag ttcaagtcga atttgcaaat gctcatttag tcgaaaaatc ggcgaatgag 5220 
ggaatggtgt tgaactggga attcgtgtcg aatgccgtta ataagcagac aagatttttc 5280 
cgctcggccc gtcaatgctc aattcgatat caaatgtttg ttcggccaaa agagctcgga 5340 
cagttggtgg cttctgatcc gatttccaag aaaacgatga aagtcgacct atcgcatact 5400 
gaattatctc atttgagaaa aggacgaatg actacggaga gccaatatgc tcatgattat 5460 
ggaatattga ctgataagaa acatgtgaat agatttaaaa gtgttcgagt ggcggcaaca 5520 
cggagacctg ttcagttttg gagaggccct aaaggtagag gaggatggct tcataatagt 5580 
cactgcaact ttttcctcac gagggacgag aaaaagtggt ttctaggcca tggccgaggt 5640 
gccgacaagt ttcagc 5656 
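
The relationship between the cDNA of Figure 20B and the ssl-1 protein listing that follows can be illustrated by translating the opening codons. The Python sketch below is an illustration under stated assumptions, not part of the original listing: it assumes the standard genetic code and a reading frame that begins at nucleotide 1, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical.

```python
# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, matching the listing's style.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(cdna):
    """Translate a cDNA string in frame 1, stopping at the first stop codon.

    Whitespace between the 10-nucleotide blocks is ignored, and any
    trailing partial codon is dropped.
    """
    seq = "".join(cdna.lower().split())
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":
            break
        residues.append(THREE_LETTER[amino])
    return residues


# First 50 nucleotides of the ssl-1 cDNA as printed in Figure 20B.
first_line = "atgccggcaa caccggtgcg tgcttcaagt actcgaataa gcagacgtac"
print(" ".join(translate(first_line)))
```

Under these assumptions the first sixteen codons translate to Met Pro Ala Thr Pro Val Arg Ala Ser Ser Thr Arg Ile Ser Arg Arg, which corresponds to residues 1 to 16 of the protein listing.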
ssl-1 protein 

<400> 3 

Met Pro Ala Thr Pro Val Arg Ala Ser Ser Thr Arg Ile Ser Arg Arg 
1 5 10 15 

Thr Ser Ser Arg Ser Val Ala Asp Asp Gln Pro Ser Thr Ser Ser Ala 
20 25 30 

Val Ala Pro Pro Pro Ser Pro Ile Ala Ile Glu Thr Asp Glu Asp Ala 
35 40 45 

Val Val Glu Glu Glu Lys Lys Lys Lys Lys Thr Ser Asp Asp Leu Glu 
50 55 60 

Ile Ile Thr Pro Arg Thr Pro Val Asp Arg Arg Ile Pro Tyr Ile Cys 
65 70 75 80 

Ser Ile Leu Leu Thr Glu Asn Arg Ser Ile Arg Asp Lys Leu Val Leu 
85 90 95 

Ser Ser Gly Pro Val Arg Gln Glu Asp His Glu Glu Gln Ile Ala Arg 
100 105 110 

Ala Gln Arg Ile Gln Pro Val Val Asp Gln Ile Gln Arg Val Glu Gln 
115 120 125 

Ile Ile Leu Asn Gly Ser Val Glu Asp Ile Leu Lys Asp Pro Arg Phe 
130 135 140 

Ala Val Met Ala Asp Leu Thr Lys Glu Pro Pro Pro Thr Pro Ala Pro 
145 150 155 160 

Pro Pro Pro Ile Gln Lys Thr Met Gln Pro Ile Glu Val Lys Ile Glu 
165 170 175 

Asp Ser Glu Gly Ser Asn Thr Ala Gln Pro Ser Val Leu Pro Ser Cys 
180 185 190 

Gly Gly Gly Glu Thr Asn Val Glu Arg Ala Ala Lys Arg Glu Ala His 
195 200 205 

Val Leu Ala Arg Ile Ala Glu Leu Arg Lys Asn Gly Leu Trp Ser Asn 
210 215 220 

Ser Arg Leu Pro Lys Cys Val Glu Pro Glu Arg Asn Lys Thr His Trp 
225 230 235 240 

Asp Tyr Leu Leu Glu Glu Val Lys Trp Met Ala Val Asp Phe Arg Thr 
245 250 255 

Glu Thr Asn Thr Lys Arg Lys Ile Ala Lys Val Ile Ala His Ala Ile 
260 265 270 

Ala Lys Gln His Arg Asp Lys Gln Ile Glu Ile Glu Arg Ala Ala Glu 
275 280 285 

Arg Glu Ile Lys Glu Lys Arg Lys Met Cys Ala Gly Ile Ala Lys Met 
290 295 300 

Val Arg Asp Phe Trp Ser Ser Thr Asp Lys Val Val Asp Ile Arg Ala 
305 310 315 320 

Lys Glu Val Leu Glu Ser Arg Leu Arg Lys Ala Arg Asn Lys His Leu 
325 330 335 

Met Phe Val Ile Gly Gln Val Asp Glu Met Ser Asn Ile Val Gln Glu 
340 345 350 

Gly Leu Val Ser Ser Ser Lys Ser Pro Ser Ile Ala Ser Asp Arg Asp 
355 360 365 

Asp Lys Asp Glu Glu Phe Lys Ala Pro Gly Ser Asp Ser Glu Ser Asp 
370 375 380 

Asp Glu Gln Thr Ile Ala Asn Ala Glu Lys Ser Gln Lys Lys Glu Asp 
385 390 395 400 

Val Arg Gln Glu Val Asp Ala Leu Gln Asn Glu Ala Thr Val Asp Met 
405 410 415 
Asp Asp Phe Leu Tyr Thr Leu Pro Pro Glu Tyr Leu Lys Ala Tyr Gly
420 425 430

Leu Thr Gln Glu Asp Leu Glu Glu Met Lys Arg Glu Lys Leu Glu Glu
435 440 445

Gln Lys Ala Arg Lys Glu Ala Cys Gly Asp Asn Glu Glu Lys Met Glu
450 455 460

Ile Asp Glu Ser Pro Ser Ser Asp Ala Gln Lys Pro Ser Thr Ser Ser
465 470 475 480

Ser Asp Leu Thr Ala Glu Gln Leu Gln Asp Pro Thr Ala Glu Asp Gly
485 490 495

Asn Gly Asp Gly His Gly Val Leu Glu Asn Val Asp Tyr Val Lys Leu
500 505 510

Asn Ser Gln Asp Ser Asp Glu Arg Gln Gln Glu Leu Ala Asn Ile Ala
515 520 525

Glu Glu Ala Leu Lys Phe Gln Pro Lys Gly Tyr Thr Leu Glu Thr Thr
530 535 540

Gln Val Lys Thr Pro Val Pro Phe Leu Ile Arg Gly Gln Leu Arg Glu
545 550 555 560

Tyr Gln Met Val Gly Leu Asp Trp Met Val Thr Leu Tyr Glu Lys Asn
565 570 575

Leu Asn Gly Ile Leu Ala Asp Glu Met Gly Leu Gly Lys Thr Ile Gln
580 585 590

Thr Ile Ser Leu Leu Ala His Met Ala Cys Ser Glu Ser Ile Trp Gly
595 600 605

Pro His Leu Ile Val Val Pro Thr Ser Val Ile Leu Asn Trp Glu Met
610 615 620

Glu Phe Lys Lys Trp Cys Pro Ala Leu Lys Ile Leu Thr Tyr Phe Gly
625 630 635 640

Thr Ala Lys Glu Arg Ala Glu Lys Arg Lys Gly Trp Met Lys Pro Asn
645 650 655

Cys Phe His Val Cys Ile Thr Ser Tyr Lys Thr Val Thr Gln Asp Ile
660 665 670

Arg Ala Phe Lys Gln Arg Ala Trp Gln Tyr Leu Ile Leu Asp Glu Ala
675 680 685

Gln Asn Ile Lys Asn Trp Lys Ser Gln Arg Trp Gln Ala Leu Leu Asn
690 695 700

Val Arg Ala Arg Arg Arg Leu Leu Leu Thr Gly Thr Pro Leu Gln Asn
705 710 715 720

Ser Leu Met Glu Leu Trp Ser Leu Met His Phe Leu Met Pro Thr Ile
725 730 735

Phe Ser Ser His Asp Asp Phe Lys Asp Trp Phe Ser Asn Pro Leu Thr
740 745 750

Gly Met Met Glu Gly Asn Met Glu Phe Asn Ala Pro Leu Ile Gly Arg
755 760 765

Leu His Lys Val Leu Arg Pro Phe Ile Leu Arg Arg Leu Lys Lys Glu
770 775 780

Val Glu Lys Gln Leu Pro Glu Lys Thr Glu His Ile Val Asn Cys Ser
785 790 795 800

Leu Ser Lys Arg Gln Arg Tyr Leu Tyr Asp Asp Phe Met Ser Arg Arg
805 810 815

Ser Thr Lys Glu Asn Leu Lys Ser Gly Asn Met Met Ser Val Leu Asn
820 825 830

Ile Val Met Gln Leu Arg Lys Cys Cys Asn His Pro Asn Leu Phe Glu
835 840 845

Pro Arg Pro Val Val Ala Pro Phe Val Val Glu Lys Leu Gln Leu Asp
850 855 860

Val Pro Ala Arg Leu Phe Glu Ile Ser Gln Gln Asp Pro Ser Ser Ser
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865 870 875 880 

Ser Ala Ser Gln Ile Pro Glu Ile Phe Asn Leu Ser Lys Ile Gly Tyr
885 890 895

Gln Ser Ser Val Arg Ser Ala Lys Pro Leu Ile Glu Glu Leu Glu Ala
900 905 910

Met Ser Thr Tyr Pro Glu Pro Arg Ala Pro Glu Val Gly Gly Phe Arg
915 920 925

Phe Asn Arg Thr Ala Phe Val Ala Lys Asn Pro His Thr Glu Glu Ser
930 935 940

Glu Asp Glu Gly Val Met Arg Ser Arg Val Leu Pro Lys Pro Ile Asn
945 950 955 960

Gly Thr Ala Gln Pro Leu Gln Asn Gly Asn Ser Ile Pro Gln Asn Ala
965 970 975

Pro Asn Arg Pro Gln Thr Ser Cys Ile Arg Ser Lys Thr Val Val Asn
980 985 990

Thr Val Pro Leu Thr Ile Ser Thr Asp Arg Ser Gly Phe His Phe Asn
995 1000 1005

Met Ala Asn Val Gly Arg Gly Val Val Arg Leu Asp Asp Ser Ala Arg
1010 1015 1020

Met Ser Pro Pro Leu Lys Arg Gln Lys Leu Thr Gly Thr Ala Thr Asn
1025 1030 1035 1040

Trp Ser Asp Tyr Val Pro Arg His Val Val Glu Lys Met Glu Glu Ser
1045 1050 1055

Arg Lys Asn Gln Leu Glu Ile Val Arg Arg Arg Phe Glu Met Ile Arg
1060 1065 1070

Ala Pro Ile Ile Pro Leu Glu Met Val Ala Leu Val Arg Glu Glu Ile
1075 1080 1085

Ile Ala Glu Phe Pro Arg Leu Ala Val Glu Glu Asp Glu Val Val Gln
1090 1095 1100

Glu Arg Leu Leu Glu Tyr Cys Glu Leu Leu Val Gln Arg Phe Gly Met
1105 1110 1115 1120

Tyr Val Glu Pro Val Leu Thr Asp Ala Trp Gln Cys Arg Pro Ser Ser
1125 1130 1135

Ser Gly Leu Pro Ser Tyr Ile Arg Asn Asn Leu Ser Asn Ile Glu Leu
1140 1145 1150

Asn Ser Arg Ser Leu Leu Leu Asn Thr Ser Thr Asn Phe Asp Thr Arg
1155 1160 1165

Met Ser Ile Ser Arg Ala Leu Gln Phe Pro Glu Leu Arg Leu Ile Glu
1170 1175 1180

Tyr Asp Cys Gly Lys Leu Gln Thr Leu Ala Val Leu Leu Arg Gln Leu
1185 1190 1195 1200

Tyr Leu Tyr Lys His Arg Cys Leu Ile Phe Thr Gln Met Ser Lys Met
1205 1210 1215

Leu Asp Val Leu Gln Thr Phe Leu Ser His His Gly Tyr Gln Tyr Phe
1220 1225 1230

Arg Leu Asp Gly Thr Thr Gly Val Glu Gln Arg Gln Ala Met Met Glu
1235 1240 1245

Arg Phe Asn Ala Asp Pro Lys Val Phe Cys Phe Ile Leu Ser Thr Arg
1250 1255 1260

Ser Gly Gly Val Gly Val Asn Leu Thr Gly Ala Asp Thr Val Ile Phe
1265 1270 1275 1280

Tyr Asp Ser Asp Trp Asn Pro Thr Met Asp Ala Gln Ala Gln Asp Arg
1285 1290 1295

Cys His Arg Ile Gly Gln Thr Arg Asn Val Ser Ile Tyr Arg Leu Ile
1300 1305 1310

Ser Glu Arg Thr Ile Glu Glu Asn Ile Leu Arg Lys Ala Thr Gln Lys
1315 1320 1325

Arg Arg Leu Gly Glu Leu Ala Ile Asp Glu Ala Gly Phe Thr Pro Glu
1330 1335 1340

Phe Phe Lys Gln Ser Asp Ser Ile Arg Asp Leu Phe Asp Gly Glu Asn
1345 1350 1355 1360

Val Glu Val Thr Ala Val Ala Asp Val Ala Thr Thr Met Ser Glu Lys
1365 1370 1375

Glu Met Glu Val Ala Met Ala Lys Cys Glu Asp Glu Ala Asp Val Asn
1380 1385 1390

Ala Ala Lys Ile Ala Val Ala Glu Ala Asn Val Asp Asn Ala Glu Phe
1395 1400 1405

Asp Glu Lys Ser Leu Pro Pro Met Ser Asn Leu Gln Gly Asp Glu Glu
1410 1415 1420

Ala Asp Glu Lys Tyr Met Glu Leu Ile Gln Gln Leu Lys Pro Ile Glu
1425 1430 1435 1440

Arg Tyr Ala Ile Asn Phe Leu Glu Thr Gln Tyr Lys Pro Glu Phe Glu
1445 1450 1455

Glu Glu Cys Lys Glu Ala Glu Ala Leu Ile Asp Gln Lys Arg Glu Glu
1460 1465 1470

Trp Asp Lys Asn Leu Asn Asp Thr Ala Val Ile Asp Leu Asp Asp Ser
1475 1480 1485

Asp Ser Leu Leu Leu Asn Asp Pro Ser Thr Ser Ala Asp Phe Tyr Gln
1490 1495 1500

Ser Ser Ser Leu Leu Asp Glu Ile Lys Phe Tyr Asp Glu Leu Asp Asp
1505 1510 1515 1520

Ile Met Pro Ile Trp Leu Pro Pro Ser Pro Pro Asp Ser Asp Ala Asp
1525 1530 1535

Phe Asp Leu Arg Met Glu Asp Asp Cys Leu Asp Leu Met Tyr Glu Ile
1540 1545 1550

Glu Gln Met Asn Glu Ala Arg Leu Pro Gln Val Cys His Glu Met Arg
1555 1560 1565

Arg Pro Leu Ala Glu Lys Gln Gln Lys Gln Asn Thr Leu Asn Ala Phe
1570 1575 1580

Asn Asp Ile Leu Ser Ala Lys Glu Lys Glu Ser Val Tyr Asp Ala Val
1585 1590 1595 1600

Asn Lys Cys Leu Gln Met Pro Gln Ser Glu Ala Ile Thr Ala Glu Ser
1605 1610 1615

Ala Ala Ser Pro Ala Tyr Thr Glu His Ser Ser Phe Ser Met Asp Asp
1620 1625 1630

Thr Ser Gln Asp Ala Lys Ile Glu Pro Ser Leu Thr Glu Asn Gln Gln
1635 1640 1645

Pro Thr Thr Thr Ala Thr Thr Thr Thr Thr Val Pro Gln Gln Gln Gln
1650 1655 1660

Gln Gln Gln Gln Gln Lys Ser Ser Lys Lys Lys Arg Asn Asp Asn Arg
1665 1670 1675 1680

Thr Ala Gln Asn Arg Thr Ala Glu Asn Gly Val Lys Arg Ala Thr Thr
1685 1690 1695

Pro Pro Pro Ser Trp Arg Glu Glu Pro Asp Tyr Asp Gly Ala Glu Trp
1700 1705 1710

Asn Ile Val Glu Asp Tyr Ala Leu Leu Gln Ala Val Gln Val Glu Phe
1715 1720 1725

Ala Asn Ala His Leu Val Glu Lys Ser Ala Asn Glu Gly Met Val Leu
1730 1735 1740

Asn Trp Glu Phe Val Ser Asn Ala Val Asn Lys Gln Thr Arg Phe Phe
1745 1750 1755 1760

Arg Ser Ala Arg Gln Cys Ser Ile Arg Tyr Gln Met Phe Val Arg Pro
1765 1770 1775

Lys Glu Leu Gly Gln Leu Val Ala Ser Asp Pro Ile Ser Lys Lys Thr
1780 1785 1790

Met Lys Val Asp Leu Ser His Thr Glu Leu Ser His Leu Arg Lys Gly
1795 1800 1805

Arg Met Thr Thr Glu Ser Gln Tyr Ala His Asp Tyr Gly Ile Leu Thr
1810 1815 1820

Asp Lys Lys His Val Asn Arg Phe Lys Ser Val Arg Val Ala Ala Thr
1825 1830 1835 1840

Arg Arg Pro Val Gln Phe Trp Arg Gly Pro Lys Gly Arg Gly Gly Trp
1845 1850 1855

Leu His Asn Ser His Cys Asn Phe Phe Leu Thr Arg Asp Glu Lys Lys
1860 1865 1870

Trp Phe Leu Gly His Gly Arg Gly Ala Asp Lys Phe Gln
1875 1880 1885
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Figure 23 

lin(n3628) genomic sequence (1 kb of upstream and downstream genomic sequence is
included in this file).



GTCAATGGAATTCTCGACGCGGATCTTGTTAGAGATGCCGTCGAGAGAGATT 

TGATCAAATTGCGGTACGCTGAAACGGATGCACCAGTTTTACAGGTAAAATG 

GAAATATACAAACTCAAAAGTAAAATTTTATGAATTTCAGATCAACAACTCA 

CTATACACGGCATCCTGGGAGCAAGATCTCGGAACAAATATGGTTCTGCAGT 

CAAAAGGAAAAGAGATGGAAGTGATTTCGTGTACATCGACCATGATGACTGC 

AGAAAAAGCCCTGTTGACCTCGTTAAGCACCGAAGGATCTACACTAGCCGCC 

AATGCAGAGACTGCTCCGAAATCTGATCTCAGTCGAACTCAACCACGTCAAC 

AATGATTTTCAAAATATAAATTAACATGAAGCTCTGAAATAAACTCATATAA 

CTGCTAAAATAAAACTGTTGCTTTTGAAACCAACATTTGTTAGACAACCTGCG 

TCTCACAGTCATTTTTCAATATATTGGCGCCGCGCACACACAAAGAAGAAGA 

ATTCGTCCTCATGGCATGGCATGTGCAGTCAGCGGCCACCCTGTGTAACCACT 

GCGTATCGCATCTTTCCACGTGTTTTTGCAATCTTGCTGTCACGTTCATTTCCT 

CGTACAACCATCTCTTCTACCCCCGTTGCCTCCTCCACCATCTCATCTCAATTG 

TGTCGTTGCCCTCCCTCTCCCCAAGTCTTTCTGCGTCTCTTAGTGCTCTTCGAG 

AAAAGAACGAGGAGAGCTGTGAGACGCTAGTAGGAAACGCATTCTCAATTC 

GATATAGGCACATTGAGAGAGAGCGAGCGCCGTTTCGACGTCTTCTAGCCTT 

CACATCATCCAGACGACGTTCACACGCACACACAGCCAACCCCACCCTTCTG 

ACAACGAATAGACGACGAAGAAGAGAAGAAGAAAAAGAAGAAGGTACCCA 

TTTTTCATTCCC'lU'rilGCCTCCACACTTCACTATTATCGATTTTGTGAGCGAG 

CTCTAATGTTTCAACGCAAAGTGGTATTGCCTAAAAAGCGGTGAGAATTTGCT 

TCAGACATjAAAtitTGTm 

GAAGGTCAATTTTTACITTCAACGCTCTTCATTGACGGAAAACTCGTTTTTCTT 

TCAAATTTTAAATTACAGAGGCATTTTACTCAAGGTTTGTTTTAATTTAAATT 

AAAAATAAATTTTAAAATAGAAATATGGATAATATAAAATGTTTTCTTCAAA 

AAATGCACTCAGGTTCACCAAAAAATCGATAATTAAAAATACGGTCGCAAAG 

GAGCGTCGTTAGCTGCTAATCAATGGTCTTAAAACGAAATCTATCGATTTTTG 

TGTACTACACACGGACAAGTGCTCCACCGTTATTTTTTGAACGAGTGCGTTGC 
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AATTCCATCCCATTTTGACGTTTTTCTTTTTTTTTTCATCAAATT^ 

TAAAGTAAAGTCAATGATAACCTGCAAATAATAATGTAAAATTCATTAAAAA 

CCGAGAGAAAAAGTCTAAAGTCATAAATTTTTGATAAAAAAGTGATTTTCGA 

AACTAAAAATCATTCAAATTAAAGTTGAACCTGATTCTTCAATTTTTATTATA 

TATTAAAAGCTTGATCCACTCAAATAAAAGGAGTTTTTAATTGAGAAAAAAA 

GCAAATGAAAAAATCGATAATTAAATTGGGCGCCAACCTAGATTTTAATATG 

TTTTTGTTAGAAATTTGTATATTTTCATCACTCTCTGACTTTAAGCATTCGTAT 

TTTAAGGAAGTGTGAGCTTTCTAATATGTTTTTTATTAAAAAAAACATGTTTT 

TAACAATCTCCCTGTCATCCCCATCACCTAATGCACTCAAATAATCAATAATC 

ACAATACTTTTATTTTTTCTTGCAGAACAGAAATGGTCCAAACGAGACGAAA 

GACAGCTGCAGCTGTACAGGACGGTGGTGCCGTTAAGGAGAACAAAGCCAA 

GCCACCTGCCCCTCAAACGCCTACAAAACGAGCAAAACGAGGTCGTCCCCCG 

AAAATTAAGACTGGTGAGCGAATGACTATACGGAAGATTGAAAATTCACGTG 

GAATACTTGCAGATGCCAATACTTTGAATACGCCAAGCACTTCTTCCAACTTG 

GTCGATGACAAACTTCTCATTGAGTCTGAATCACAGGTAAATTGATTCTTTTC 

TATTCAAAAATTAATCTAAACTATACATTCCAGGACTCGATTCTCACAAACGA 

AGCCGACTCTTTTCTGGAAAAAGAAGTGGAAGAAATCGAAGATAGTTCAGAT 

ATACTTCCCGATAAAATTAATTCTCCAGAAAAACCAAGTGTTTTGGTGAAGC 

GGAGATCGAGTACGCGGTTAAAAGTGAAGACTGATGAAGATGAAAAAGATG 

TTCCTGTGAACATAGAAGTAGCCGTTTTAGAAGAAAAATCAATTCAAATCGA 

GCCAACATCTCCCGCTCACCCGGAAGATCCTCAGGTGAGCTTTTTTTAAAAAT 

ATGTATTAATCAAAATTCCTTCATTTCCAGCCTTCGACTTCTTCTCTTCCACTG 

GTAGAACCAATTGAAGACATTGTGGAGCCAAATGAGCCAACAAGCTCTGCCG 

ATCCTCCAGTATCAAATATTAAGGATGAGGATATTAAAGAAGAAGAGCCACT 

GATTAAAAAGCCAGCTTCCGATGAGTCAGAATCTATGGATATAGCTAACTCT 

GAAAGTGGAAATGATTCCGATTCAAGTGAAGCTGATCCTAGGACGATACCAT 

CTTTCTCTATACCrCTTCCCGACACACCACCTCCAAATTTTGCGAAAAGAGGA 

GAAATACATGTAGATGTAGATCAGAAAAATTCCAAGCAATCAGGAGAATCAC 

AATCGCCTTGGGAGCGGTAAGAATATTTATCCTAGCCAGGTGTTATAACAAA 

ATTGAATAGTTTCAGAGCAAGAGAAAAGTCTGCATCGAACCCATTGTCCTCT 

CCAACAATGAGCCGACCCAGGATACACTTCCTTCATCCAGCATATCAAAGTTT 

CACAAATGATTCAGTTTCACCTCTACCACCACCGCCACCAGAGCCGGCTCCA 

GCTCGTGAAAAAGTGGAAAATGGTGGTCCAACTACTTTCAAAATGACTTTCA 

AAAAAGCTGCAAATATTCCTATCTTGAAGACATCGGCATTTGAACAACCATC 

ATCACCTCCACCTTCCTCATCAGTTTCTTCATCAATTTCATTATCTGAAGTGAA 

TTCTTCTACATCGATAGCCTCCGAGTCTTCTCCAGCGAAAAGAAGCTCAAATT 

TCGATTTAACTGCCTCAAATGAGCTTCCACCACCTCAGATGGTTGAACTTCCC 

AAGCTCTCATTTTTCAATATGCCTCCAGCCGTTCGCTCCGCAGAGGTTAGTTA 

ACTTTTTCCCGGTTTCATGAAATTTCAGCGGTATCTGTCCTCCTTTTGGTGTGT 

GCCCTCACAACCTAACCTCTTTTATCCAGGACGATTCTGCGATGACGTCGGAA 

GAACCGATCCTTCTCCTCCGTTCTCCGAATTCCGCCACTCCTGATGATGATGC 

ACmTCCJCACGACCCCACCACCACCCAAGATGACCGAATCAGAAATTCAA 

GCACTGGTGAGCGAGATCACA 

TTCAGACCGTTTTTCTTTACACCTCATCCCCTTTTGTGTTATGTTAACATTCAT 

TTTGTGTCTCAAACACTGCATGCTTTTGCACTTGGAAATTAAAAAATAATGCG 

TTCTGGGATTTTGTGTGTTAAGGTGGAGTAGAGTTTGTGAGGCTAGAAAGTAT 

GCCTTTTTCGTTTCTCCACTGCAAAATTTCGTTTGAAAAAAACAAAAAATTTA 

CTAAAATTTGAAATTTCACCAACTTGCCGTTGTCACAGCTGCTGAAATACAGT 




TTTTATTGCATTTTCACCCTTTATTGCATATTATTATTAGACACCTTTTAGGTC 

AATAGGCAACCGAAAATATCCGAATTTGACTTAAAATGTACCTAAATTAAGG 

AACTAACTTGAGATATACGACTAAAAATGCAATAAATTGTGAGAATTATTGT 

TATGAAATTCAGCCGTTTTAGGCTAGTTTTAGCCAAAAACCGACAAACTCTAT 

TCCAATTAATTTTCCACTCCTGCACCTCGATTAGTGATTTTTTGAAGAAAAAA 

AATTATCTTCTTATTTCAGAAAGTAGCGACGGAAAAAGTGAATCAAGTAATT 

GCTCGACGTGAAGATTCTGAAAAAGATGTACGTCACAGAGAAGATCGAGATG 

ATTATGATAGACGACGTGACGACCGTGACAGAAGATCCAGAAAGACTGATTC 

GGAACGAAATGATCAAAGAGGACGACAACGTGAAGATGATGAACGAAGAGC 

TCGAGAACGAGAAAGAGAAGTTACGAAACGACATGATCGGGAAAGGGAAGA 

GATGCGATTACAGAAACAAAAAGATGAGGAAAGAAGAAAGAAAGATGAAG

AGGAAAGGATACAAAAAGAGAATGATGAGAAAAAACAAAAAGAGGATGAA 

GCCAAAATGGAGGAGGAGAAAAAGAAGATTAAAGAGGAGGAAATGAAGAT 

TCCTGAATTTGAGTTGATTAGCGAATCAAAATATTTGACGAGGAATGCGAAT 

AAAAAGAAGACTGAATCCTTAACGTAAGTTATTATTTATAAATTTGACTTAAA 

AATTGATAACTTTCAAAATTAAGTGATTCAATAGACTCAAAAGAATGAAAAA 

CTAGAGTGCGCCTTTAAAGAGTACTGTAATTTCAAACTTTTGTTGCTGCTCAT 

TTTTCATCGATTTTTCTTAGTTTTTCGTTAAAAATAATTCAACCATTGGATTAA 

AAAAAATTAAAAACACATAAATTTTATTTTGAAAAGTAATGAGAAAAACTAT 

AGAAATTCGCCGAAAATTCTACAGCAACAAAAGCTCAAAATTACAGTACTTT 

TTAAAGG AGCAC ATCTTTCTG AATTTAAC AAAAATTCGG AG ATI Tl T C I TTT T 

TTCGTGlTmCTGGCGAAAAAACGATTTTTCGCTTTTACCGGAAACGGTATC 

CGGAGGAAAAAAAAAACGAAAAAAGCGAAAAATTTTAAGAAGTTTCAAGAT 

TAGTTACAAACTCTTTTCAAAAGCAGATTCTACAGTTTTTTGGGGTTTTGCCA 

AAAAATTTATGAAATATAATGTTTTTTAGACTAGAAAAATAAACTAATTTTAA 

TT1TCAATCAAAAGCTCATTATTATATTTATATTTATATAATTCAGTTGCGAAT 

GCCATCGAACTGGTGGAAACTGTTCGGACAATACTTGTGTGAATCGTGCAAT 

GCTCACCGAGTGCCCATCATCATGTCAGGTCAAATGCAAGAATCAACGATTT 

GCAAAGAAAAAGTACGCGGCTGTTGAAGCATTCCACACTGGAACCGCCAAA 

GGATGTGGACTTCGAGCAGTGAAAGACATAAAAAAAGGAAGATTCATCATTG 

AATATATAGGAGAAGTTGTGGAAAGAGATGATTATGAGAAGAGAAAAACGA 

AATATGCAGCTGATAAAAAGCACAAACATCATTATCTCTGTGATACTGGAGT 

CTACACGATCGACGCAACAGTCTACGGAAATCCATCTCGATTTGTGAATCAT 

AGTTGTGATCCTAATGCTATATGTGAGAAATGGTCTGTACCAAGAACTCCTGG 

AGACGTTAATCGAGTTGGTTTCTTCTCGAAACGATTCATTAAAGCCGGCGAA 

GAAATCACATTTGATTATCAATTTGTCAACTACGGACGTGACGCTCAACAATG 

TTTCTGTGGAAGTGCTTCATGTAGTGGATGGATTGGGCAGAAACCGGAAGAA 

TTTTCATCTGATGAGGATGATGATATTGTGACTACAAGGCATATTAATATGGA 

TGAAGAAGAAGAAGAAAAGTTGGAAGGTCTTGATCATCTTGGAAATCATGAA 

CGGAATGAAGTGATCAAGGATATGTTGGATGATTTGGTCATTCGGAATAAGA 

AGCATGCTAGGAAGGTTATCACAATTGCGGTAAGCATTTATTTGTAGAGAAA 

ATTTAAAAATTAAAGATGG - 

TCCAATTTTTCCTCTGATTCCGAATTTTTAAATGAAAAAATTCAAAAAAATTT 
CCTTGATTTTATGTTTTAACTTGAAATTGCGAATTTCATTTGTACAGATTTTTG 
Ai^CGCCGAATmCGCGGCAGAGAAGCCATGTGTCGATTTTTGAGATTTGTG 
TATATTTACAAGATTTTGAATCTTCATCGGATGCTGATTTGCGTTTTTCATCAT
TATATTATCAAAAAACTAACAATTTGTTCGGTTTTACGGAAATTAACAATATA 
GACTAGACATTTCGTAAATATACACAAATCTCGTAAATCGACACATGGCGTC 
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tctggcgcgaaaattcggcatttgaaaaatcttatgcgggcactaatgaaat 
tcgtgatttcaagctgaaatataaaatcagggaattttccttgcattttttca 
ctcagaacttcggaatcagttgcaaatttggagtcatttgaaaatatttctca 
gatttcggtactccacctttattataatttttaaaattttttaaatgatttt^ 
ttccatgttcaacaaaaaaataaattttcagtctgcaatgaccgattactctc 
aacgtgtggatgtcattcaagaaatcttctcctcagacacctccgtaaccgtt 
caaaaattctatgcaaaagagggaatggctacattgatggctgaatggttgt 
ctgaagatgattattcgctggataatctgaaacttgttcaagctattctcaaa 
gctcttcacactgaactattcgattcgtgcgccaaaaatgatcgactcttacg 
agattctacatcacgatgggtcaatgcgaaaatggatgaatatgttgatata. 
caagtgatagctgattcacttattgcttgtgttgaagatcccgtacaggagta 
caaggatgtttgcaaagttatagaggtatatacatattaatttttaaaaaag 
aatattttttgcatgtcacaaaatatttggaaattttcccgaaaaacccatga 
aatcaaaaaacaaattaaatagtaaaattatttcctcctacgaacatttttcg 
~ atttttggxitrccg atattccttttaaa^ 

taaattttaggtctttttgctcctttttagaagcaatttatatgttttttaaaa 

caaaacttaaaattagcatttttatgggtaatltrctgaacacatttttttttc 

gaaaaaaatggccagaatttcaaccacttctccgtaaaatcgaaattaacta 

attttttctctatacatttttcaaaaaaagactcctcatttattgtattagata 

caaatatatgttttcctcatcaaaatttacgaaatttgttataattttgaattt 

tttttgtttttttttcgaaaaattgaaaattttctaattttgaaacgatattat 

acaatttcagcgccatcaatttaactaattaaataatttcagaaaggtctcgt 

cgaaaacttcacaagagccaaagagatggcctatcggttaaatcaatactgg 

ttcaatcgatcagtgagcttcaaaattccaaaaaagatacgtgatcctgtgc 

caaaagatgttccagtcagacaagaagatgctacaacatcatcacaatctca 

tgataatagtagtagaactgtatcaccgaatcatcgacatcattcatcttcat 

attcaaattcatgttatcaagaacgagaaccatctcatatacgattctttaat 

aatggaaatgatgttcatcaatatcgttttggaggttatcatggaaataacta 

caatgataactatttcagtagaaggcccaataaggattcatatcgagatcgc 

cgtcgatttaatggacgtcgttcgagaagtcgatcaagaagtgtctcaccac 

agaactataaaagaagaaaactcgatgaacatgacaataatcatcgtcagc 

gttctccaattcgtgatcgtcacacatctcccggcggcgaaaagactcctagc 

tcgaataattctggagaacgaaactataaaagactggatattcgaggagctc 

gtataaaaactataaaagaagatttggaagctgctgctgctgctgctgctgc 

tgctgctgtagcatcagaagtgcaagcttatcctcatgaacatacagctgtac 

atcagagtgtttatcagatgccaggttatgagtcttatggttggtttagtttt 

tttaaaaatatcatttaccagggtgccatttttaaaaataaaaataactcgga 

aaatatgtttttaaaaaatttcagaatttctctcatcaacataaaacttgata 

aaaatcgaatttttattattttctaaacattttttcggtttttccgaaaatcaa 

aaaaaaagtttagaaaatagcaaaaaatcagtttattagaaatcaaattttg 

ttcgttttgataagaaaaaacataagaaaacatgttattttcttctgaaaaaa 

gaa^aaatcgaaaaatctatggccttttggcaaaatgttttggaccaaaaa 

acaaaacaaAtagcattaaaa 

ttttctgaaagtcttgcttgtcgtatatcaaataaaaacatttttcaggagta 
tatgatcctgtaaatggtgtctacatgtatcgtcatcctggcgctggttacta 

TCCACCTGCCTATCCACAACAACCGATTATGTTAACAATGGACACTCTTCCAC 
CGAATGATCGTCTTGGTGAACTTTACGAGAAAGCCAGTATCGAGCAGCTAGC 
GTGAGCATTTTTTAGTTTAAACCTTTCGGATTTACCTAGAAAAATGTTACCTTT 
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GACGCAAAATTACGGTAGCAGGTCTCGTCGCGACeGAAATTTTTCAGCGGAG 

TACGGTAGCTTCCCATGAATTTTTTTGCTGAACTTATCTTTCTGATAACAAATA 

GTAACTAAAACATGAAAAACTGAATAAAAATTGATATCTTTACCTTATAGGC 

TCTTTAAGGGCGCAGACACAAAAACTGACCGGCTACCGTAATTTTTCGTCAA 

AAGTCACACATTTCTCAACTGGTGAAATCCGAAAAAATTGAAATTTTTACTAC 

TCGTCCGACTGTTTAGAAAAGATTAAAAAAAAAGAAAAAAAGAATGTCGGTT 

TTTCGAATTTTCGATTTTCAAAGAAAAAAATCAATATTTAAAAATCATTTTCG 

GTAATTTCCCTAAATTTGTAAAATATAATTTCCAATAAATGTTTTTTGTTTTCC 

GGAATTTTAATAAAAAATCAATTTTCGCGTAACAAAAATGCGAAAAAATGAC 

TAGCCACTCGAATATAATAACACATGAAATAAAATTAAAATTATTACAGTCA 

ACGAGATGCAATTGTGAGACAAGAACTTGAGCTGATACGTATTCAAATCGAA 

AGAAAAACTGCTCAAAAAGAAGCGATCAAGGCCGCTTGCCGTCGTGCTAACG 

AAGAAGAAGCTAAACGACAAGAGGCACTTGCAAAGACGAAATATGTTTGGG 

CGATTGCAAAGTCAGAAGGTGGAGAGACGTATTACTACAACAAAATAACAA 

AAGAGACGCAGTGGACAGCACCAACACCAGTTCAAGGTCTTCTCGAACCGGC 

TTGTGGTGCATCTCCTGATACTACAGTTGTCATTGCTGACGAGATTACTGAAG 

AAGAGCAACAAGCTGAAGTTCTGGAGAAGCCGCGTGTTGTTAAGGAAGAAG

TTATCGAGCCAGGTTCACAATCTGAAACTCAAAAAGAATCTGCGGAGAAAGT 

TCGAGTTGTTGTACCGAAAGTTGAAGTTGAAAGATCACCGTCGCCAAAATCT 

TCTCGTGATCGTGAGAAGGATCGAGAGAAATCTCGTGAGAAAGATCGTGAAA 

GAGATCGTGACAGAAGAGAAGGTTCAAAACATCGTGATAGTTATCATGGACA 

TCGAAACGGCAGCAGTTCTGTCAGTGAACGACGTATGCGAGAGTTCAAACAT 

GAGCTGGAACGATCCACTCGATCTGCCGTTCGTTCTCGTCTACAACATCAACG 

TGACGCTTCTAGTGATAAGACTACTTGGCTTATTAAGTTAATATATCGAGAGA 

TTTTCAAACGAGAAAGTGCGCAGAGTGGATTTGATTATCGATTCAGTGAGAA 

TACTGATAAGAAGGTAATATTATGGACCAAAAAATAAACAATTGAAAAAAA 

AACCAAAAAAATCTGATGCTTGAATTTAAAAAAAAACAATGAAAGAGTGCA 

ATTTTTTAGGTTTTTTGGTCTTTTTTTTTGGAAAAACCAAAAAATAAATTTTTT 

TCCAAAGTACCAAACTTCATTTTAAAAAATTTTATTTGACATAAAAATTGATA 

ATTTAAAACTAATTTGAACATTTTTCCGCAAAAATTATAGATTTTTCTGCCAA 

TTTTAGATTTTTAACGTTTTTTTTCGGACAATTAATGTTTC 

GAATGAATATGATATCTGATGAAATTCAAAAATAATGCAATTTAAATAGAAA 

ACGGTACAAAAGTTTTGAAAAATTTAGAAGAATTCTAAAAAAAATCCTGTCC 

TTCAGGACAAAATTCAACCTTTTTCTCAAAACACAAAAATTACTTTATATTAT 

TTTTCAGGTGAAAAACTACGTCAAGTCATATATCGACCGAAAACTCGAATCA 

AACGATCTCTGGAAAGAATACTCTCGGCCATGAGCTTTATTTTTTAATTTAAA 

TTTTATAAAAAAATGTTTATGCTTGTTTTTTTCTCTATAGTTCCCTCCTATCCC 

CCCCCTCCCCTATCGCCTAAAAATTGATCTCTGTCTGATTTCACCGATTTCCGT 

TTTATTTGATCCCATTGAACGAGTATATCATCATGTTCCTGAACTTCAACGTTC 

GCACATTTTATTCCCCTAGTTTTATGTCCCCAGAATTGTTTTATACTATCCTGT 

AATCCACCTCAAAATGACAGCCATGAAAAGCTGTTTTTCATGTTTTCTATTTT 

CTTGTTGA-T^3TAT?-TGG<^G€-Te^ 

AAAATGAATTACGGATGTTGAATTTTTAAATTTATTTTTTTAAAGAAAAATTG 

TGGAAGTTTTTCAGATTCTATACTGCTTATTTTTACGCTAAATTTTTTTTCGAA 

GTCCCCTTTTTTCAAATCGAAGTGTAACTGCGCTCCACGATCAATAGAGACTC 

TCCGCCCTCGAACCATGGGTCTCGTTAGGTATTTGGCAGACTTACCGTAAATT 

CAAATGTTTTATTACTTCGCGACTAATTTTTTTATTCATGACTCAATTTTTTAT 

CAATTCCAACGAAAAACTAATTAAAAACAACGGAAAACATAACGAAAAATG 
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CTTGAAAATTGCAGACATTTCCGAAATTAATTAAATTCCTAACGAGACCCATG 

GCTCGGGGGCGGAGTGTTTTCGATTAGCCATGGAGCGCGTTGAGATATTCCT 

AAATTTTTCTATTCAGATGTCGAATCAATCAAAACGGGTCACAGTGAGAATT 

GAGCATTCGAAGAACACTTTTTTCGAAAAGTAATTTTCAAATTTTGATCCAAA 

GAAATTATTCGTCAATTTTCAGAGTTTTAAAATTCCAACATCAAGAGCAAGA 

AGATCGGAAGCTCAAATATGTTCTGCACAAAGCTCACGAGAATCTGAGAAAG 

TGCCCATTCGAGATTCTGACAATTG 
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Figure 24 LIN(n3628) Protein 
MFQRKWLPKIQITEMVQT 

]^PKJKTDANTLNTPSTSSNL\T>DKLLffiSESQDSILTNEADSFLEKEVEEIEDSSDI 

LPDKINSPEKPSVLVKRRSSTIU.KVKTDEDEKDWVN1EVAVLEEKSIQIEPTSPAH 

PEDPQPSTSSLPLVEPffiDlVEPNEPTSSADPPVSl«raaDEDIKEEEPLIKKPASDESES 

MDIANSESGNDSDSSEADPRTIPSFSIPLPDTPPPNFAKRGEIHVDVDQKNSKQSGE 

SQSPWERAREKSASNPLSSPTMSPJ»PvIHFLHPAYQSFTNDSVSPLPPPPPEPAPAJRE 

KVENGGPTTFmTFKKAANIPILKTSAFEQPSSPPPSSSVSSSISLSEVNSSTSIASES 

SPAKIlSSNFDLTAS>ffiLPPPQMVEIPKLSFFNMPPAVRSAEDDSAMTSEEPn.LIJR. 

SPNSATPDDDA^LTTPPPPKMTESEIQALKVATEKVNQVIARMDSEKDVRHRE 

DRDDYDRRRDDRDRRSRKTDSERNDQRGRQREDDERRAREREREVTKRTO 

EEMRLQKQKDEERRKfODEEEPJQKENM 

FELISESKYLTRNANKXKTESLTCECHRTGGNCSDNTCVNRAML 

KNQRFAKKKYAA\^AFHTGTAKGCGLRAVKDKKGRFIffiYIGEVVERDDYEKR 

KTKYAADKXHKHHYLCDTGVYTrDATVYGNPSRFVNHSCDPNAICEKWSVPRT 

PGDV^qRVGFFSKRFIKAGEEITFDYQFVNYGRDAQQCFCGSASCSGWIGQKPEEF 

SSDEDDDIVTTRHimiDEEEEEKLEGLDHLGNHERNEVIKDMLDDLVIRNKKH^ 

RKVITIASAMTDYSQRVDVIQE1FSSDTSVTVQKFYAKEGMATLMAEWLSEDDY 

SLDmKLVQAILK^LHTELFDSCAK^T)RLLRDSTSRWWAKMDEYVDIQVIADS 

LIACVEDPVQEYKDVCKVffiKGLVENFTRAKEMAYRLNQYWFNRSVSFKIPKKI 

RDPWKDVPVRQEDATTSSQSHDNSSRTVSPNHRHHSSSYSNSCYQEREPSHIRFF 

1WGNDVHQYRFGGYHGNNYNDNYFSRRPNKT»SYRDRRRFNGRRSRSRSRSVSP 

QNYKRRKLDEHDNNHRQRSPIRDR^ 

TIKEDLEAAAAAAAAAAVPSEVQAYPHEHTAVHQSVYQMPGYESYGVYDPVNG 

VYMYPHPGAGYYPPAYPQQPIMLTMDTLPPNDRLGELYEKASffiQLAQRDAIVR 

QELELIRIQffiRKTAQKEAIKAACRRA^EEAKRQEAlJ^TKYWAIAKSEAGET 

YYYNKITKETQWTAPTPVQGLLEPACGASPDTTVVIADEITEEEQQAEVLEKPRV 

VKEEVIEPGSQSETQKESPEKVRVWPKVEVERSPSPKSSRDREKDREKSREKDR 

ERDRDRREGSKimDSYHGHRNGSSSVSERRMREFKHELERSTRSAVRSRLQHQR 

DASSDKTTWLIKLrYREIFKRESAQSGFDYFLFSENTDKKVKNYVKSYIDR 

DLWKEYSRP 
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Figure 25 

lin(n4256) genomic sequence (1 kb of upstream and downstream genomic sequence is
included in this file).



GCTTGCATCGAAACTCTTCTCATTATTTACGTGATGATCACATCTTTCGTTGGG 

CTGTACTCCCTTCCGGTTCTTCGTTCTCTTCGACCTGTTCGAAAAGATACTCCA 

ATGCCAACGATAATTATTAATTCTTCAATAGTTCTTGTTGTTGCATCCGCTCTC 

CCAGTAGCTGTTAACACAGTTGGAATGACAACTTTTGATCTTCTCGGCTCCCA 

CTCATCGCTCCAATGGCTTGGATCATTTCGAGTCGTTGTTGCCTATAATACTCT 

ATTCGTCGTGTTGTCTGTCGCATTTCTCTTCAATCAATTGACTGCTTCAATGAG 

AAGGCAAATCTGGAAGTGGTAAGCTGTGCAATTTAAAGTTTAAATTCTTATTA 

ATTTTTTTGCAGGATATGTCAACTACGATGTGGAATCAGACGGGAGAGTGAT 

GCGGATGAAACCATTGAGATCCTTAGAGGCGATAAGAAAAGCAATTGAATTT 

CTTTCCTTTTTCAACACTTCTTACCCATGTT^ 

AAAACAAGGTCCTATTTTTTTTCTCGGGTACTACTCGCCTTTTCTAATAATTCA 
GAATCATCAATTTTTGCCAACCTCTAGCTTTACATGTCTGTTTTTCATCATTTT 
CTCTCAAGCATTCTCCTAATATATTATGTTCCCTAGTATTTCCCCTCAGTCAGC 
AATTTTCTCGTCGTCGAAACCGTTTAGCTTTACTTTCAATCAAAACGTGGAAC 
ATTTTTCAAACTATTTGAAGCCAAAAAAAACCAGGGCTTTTGTATATGTACCA 
TATTTTCCCTCTGATTTTCTTTATCGCCTTCTCTTTTCATGTAGAATAACTGAA 
ATACAAACCATTTTAATTTTTTCTTTTAATTATCAATACTGTCCGTATAGGTAA 
AAATTATTTCTTCAGGTTTGAAAAAATCCGAAATATGTATCTGCAACTCTTCA 
GGGCATTGCCTCAATTAATTTTTATCTAATATTCAGATGGACCAACAAGAACC 
ATCGAATAACGTAGATACGAGCAGTATTCTTTCGGATGATGGGATGGAAACA 
CAGGAACAAAGTTCATTCGTCACTGCTGTGAGTGAAATTATTTAAAATTTCGC 
TTCGGAGATTCATTGTCATATAATTCAATTTATCGATTTTCAGACAATTGACC 
TAACAGTGGACGACTACGATGAAACAGAAATACAGGAGATTCTGGATAATG 
GAAAAGCAGAAGAAGGAACAGATGAAGATTCTGATTTAGTTGAAGGGATTCT 
TAACGCTAATTCAGATGTCCAAGCGCTCCTTGATGCGCCATCTGAGCAAGTA 
GCTCAAGCTCTTAATTCGTTCTTCGGAAATGAGAGTGAACAAGAAGCTGTTG 
CAGCACAAAGACGGGTTGATGCGGAGAAGACTGCCAAAGATGAAGCTGAAC 
-T^AAGGAAGAGGAAGAGGGGGTTAGATr^GAATAAAGGAAAGAATAATAAA 
ATTATTTTATTTTCAGGAAGATCTTATTATAGAAGATTCGATAGTCAAAACTG 
ATGAAGAAAAACAAGCAGTTCGAAGACTGAAAATCAACGAATTTTTATCGTG 
GTTCACAAGGCTCCTTCCAGAACAATTTAAAAATTTCGAATTCACAAATCCGA 
ACTATCTGACAGAATCTATCAGCGATTCACCGGTTGTAAATGTCGATAAATGC 
AAGGAAATTGTCAAATCGTTCAAGGAAAGTGAATCACTTGAGGGACTTTCAC 
AGAAATACGAATTAATTGATGAAGACGTGCTAGTCGCTGCTATTTGTATTGGC 
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Exon boundaries (inclusive) 
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4 
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1001 - 1096 
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GTTCTCGATACCAACAACGAAGAAGATGTCGACTTTAATGTTCTATGTGATGA 

TCGTATCGACGATTGGAGTATAGAAAAATGTGTCACTTTTCTTGATTATCCAA 

ATACTGGATTGAATTCGAAAAATGGACCGTTGAGATTCATGCAGTTTACTGTC 

ACATCACCTGCATCAGCAATTCTCATGCTCACTCTGATTCGATTACGCGAAGA 

AGGGCATCCGTGTCGATTAGATTTTGATTCAAATCCGACTGATGATTTACTCT 

TGAATTTCGATCAAGTGGAATTTTCTAATAATATCATTGATACGGCAGTCAAA 

TACTGGGATGATCAGAAGGAAAACGGTGCGCAGGATAAAATTGGCAGGCGA 

GTATTAATCAAACTCACAACTGTTTTGAAAGTATTTTCATAATTATCACTTAA 

ATACCHTTTAGAGAGCTCAACGACTTCTTCCACGAAATCGAGTCAACATCAGC 

AGAATTCAAACAACATTTTGAGAACGCCGTTGGCAGCCGTAATGAAATAATT 

CAACTTGTCAACGAGAAAATTCCCGATTTTGATGGCACTGAGGCTGCTGTGA 

ATGAGAGTTTTACATCCGATCAACGAACCGAAATTATCAACTCTCGTGCAAT 

AATGGAGACATTAAAAGCCGAGATGAAGCTCGCCATCGCCGAAGCTCAGAA 

AGTTTACGACACCAAGACTGACTTCGAAAAATTCTTCGTTTTGACAGTTGGAG 

ATTTCTGTCTGGCTCGCGCCAATCCTTCTGACGATGCAGAATTAACATACGCC 

ATAGTTCAGGATCGTGTGGATGCAATGACCTATAAGGTTAAATTTATCGACA 

CAAGTCAGATCAGAGAGTGTAACATCAGAGATTTAGCCATGACTACGCAGGG 

AATGTATGACCCGAGTTTGAATACATTTGGTGATGTTGGTGAGTTTTAAGTTA 

AAATTGATATTTAATATTACATCTGTTATGTAGAATAAGGGTTTCGGTTTTTC 

GATTTTATTAGAAAATCGAAAATTTTAGTTTTTGTGTTAAATTTAAAAAAATC 

AAAATTTGATTCACTATCAAGTCCGTTTTTCTCnTCTCAAAATTGACAAAATTT 

TGATAATCTAGAATTTTCGTCCCGTATATTTTTCAACGAAAAACCATTTAAAA 

TTTTCCATGATTGGATTTTCGGTTGATCTAGAAAAAAATGGTGCTAAACACTA 

AATTTGAAAAAGTTTGAAACAAATTCAAATCCAAATATTTCATGAAAAACTT 

GTAAAATATATTATGTACACAAAAAAACGTTTCAAGTGTAGCAGTTGTTTTTT 

GTGGTCCCAAAAAAGCAGATGTTTGTCAGAATCCATTAAACAACAAAAAAAT 

CCAAAAACTCAACCTGGCCTAGATATCAGTTTCATGATCGAAGTATCTAAAA

TCATTGTTTTCAGGTCTTCGAGTTGCCTGTCGCCAAGTTATTTCCTCGAGCCAA 

TTTGGAAAAAAAACAATTTGGCTTACCGGTACAGCTGCCGGACGTCGCAGAG 

CTCATAGATCCGATTTTCTAATTTTCTTCGACAACGGAACCGATGCATACGTG 

TCAGCTCCGACAATGCCTGGTGAACCAGGTTATGAAGTTGCTTCTGAAAAGA 

AAAGTGTATTTTCTCTCAAAGAAATGATTGCGAAGATGAATGCTGCTCAGATT 

GCTATTATGGTTGGACAGCCAGTAGGAAAGGAAGGAAATCTGGATTATTTTT 

TGACATTTCATTGGATTCGACAATCTCACAGATCAGCGTATATTCGGGATTTT 

ATGAAAGAATTTCCGGAATGGCCACTTCTCAAGATGCCAGTTGGAATGCGAA 

TCTGTTTGTACAATTCTCTTGTTGATCGACGTAAGAAAATGGTGACAGTGATT 

GGAACTGATCGAGCTTTTGCTATTGTGAGACACGAAGCACCGAATCCATTGG 

CTCCTGGGAATAGATGTACAGACTTTCCGTGCAATGATAGAAATCATCAGCA 

TATTGACGAGAAAATCTATAGAGGATCTCATAGATTGGAAGGCGCAGCGGTA 

AGATTTTATTTGAAAAATTGATACAAAACGAGGATTTTCTAAAATTATTTTAT 

TTTTATTTGATTTGATTTCTTATAATTGATAATCAAGGTTTTTTGGATGTTTTG 

TTAGAGAA-ATCGAAAAGGGAAACTIlCCAA^AAAAAGCTGTjS 

TGCTTTTAATAATATCCAAGTTTCATCTTCAAAGTTTTTTCTATAAAATGGACA 

CAAACTITTCAACGTTTTCAAAAAAAAGGTTCCGAAAATATGAAAAAAGGAG 

AAAGAAATCATGAAAATTTTGTATTATTTCAGCACAAGAAGCACATGATCTC 

GACAAATAACAATCTGTCGCAACGCAGAAAAGACCAGCTTCAATCACAGTTC 

GAGCCAACCGACATGATTCGTTCGATGCCAGAGAGGAATCACCAACAAGTCG 

TTAAAAAGAAAACGACGGGCACCAATCAGAATGTCGCTTCGACAAATGATGC 
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AAAATCGAAGAGAGAAATTGAAATAAGAAAGAAAAATCAATTCTTATTTAAC 

AAGATTATTGTTCCAATACCCGTCCTAACACCATTGGAAAATCTCAAGGCTCA 

TGCTCAATGTGGTCCAGATTGTCTACAGAAAATGGATGCGGATCCGTATGAA 

GCAAGATTCCATCGAAATTCACCAATACATACTCCTCTTTTGTGTGGTTGGAG 

ACGAATTATGTACACAATGAGTACTGGAAAGAAGCGGGGAGCAGTGAAGAA 

AAACATTATTTACTTTTCTCCATGCGGAGCCGCTCTTCACCAGATCAGCGACG 

TCTCTGAATATATTCATGTCACCAGAAGTTTATTGACGATTGATTGTTTTTCAT 

TTGATGCACGAATCGATACTGCCACTTATATTACTGTTGACGATAAATATTTG 

AAGGTTGCTGATTTTTCGCTTGGAACCGAAGGAATCCCAATTCCACTAGTGAA 

CAGCGTGGATAACGATGAGCCTCCATCATTGGAATATTCGAAACGACGATTC 

CAATACAATGATCAAGTGGATATATCGAGTGTTAGCCGAGATTTCTGTTCTGG 

ATGCTCTTGTGATGGTGATTGCAGTGACGCATCGAAGTGTGAATGCCAACAA 

TTGTCCATTGAAGCAATGAAACGACTCCCCCATAATTTACAATTCGACGGAC 

ACGACGAATTGTATGAGAGTTCAGAAAAACAAAATAAATTTTTAAAACTATT 

JTTTTJTCAGAGZTCCTCACTATCAAAATjCGICJTC^ 

gtggactctatgaatgcaacgatcagtgttcatgccatcgaaagtcttgttac 
aacagagttgttcagaacaatatcaagtatcctatgcatgtgagtttatttaa 
cgatgatacataccaattattgttttttcttcagatcttcaaaactgctcaatc 
cggatggggagtccgagctttgacggatattcctcaaagtacgttcatttgca 
cgtatgtaggtgctatactgacggatgatttggctgatgaactaagaaatgc 
ggatcaatacttcgctgatttggacttgaaggataccgtggagctggaaaag 
ggtcgcgaagatcatgaaactgattttggttacggaggagacgagtcagatt 
atgatgacgaagaaggaagtgatggtgactccggtgatgatgtaatgaaca 
aaatggtgaaacgtcaagactcttcggagagtggtgaagaaacaaaacggc 
tgacaagacagaaaagaaagcaatctaaaaaatccggtaaaggaggaagtg 
tggagaaagatgacaccactccaagagattcaatggaaaaggataatattg 
aaagtaaagacgaacccgttttcaattgggataagtattttgagccgtttcca 
ttgtatgttatagatgcaaaacagagaggaaatcttggaaggtaagatcaca 
attttattcattaaaaaaattttttagagattttgctttaaatgataaaaaat 
ggacaaaccaaccgtttgcctcttcttttggtttatcaacctttctctatggaa 
aaaattctgaaaaattaacaaacagtatttcacgttgaaaagtgaagaaaaa 
agcaaaaaaaggaaacaaatttcaaaacggttctactccatcttaaaaaaac 
taaaattcgtaaaaagtcatttggtatgttttggagactataatacaattgag 
aaaatttgaaaaaccggcactccaaagatacaatcataaattttcgataact 
ttcagattcttgaatcactcttgcgatccgaatgtgcacgttcaacacgtcat 
gtacgatacgcatgatcttcgtcttccatgggtcgcgtttttcacacgaaaat 
acgtgaaagccggcgatgagctaacctgggactatcaatatactcaagatca 
gacggctaccacacaactcacatgccactgcggagctgaaaactgcaccggc 
cgtttgctgaaaagttaaagaattgttgttatttccttcccagttatgttttcc 
tttttttttaagtatttatttatttatttaatttttattttgtttattgttcaatc 
gtttaaaatctccctttgaaaacagcatctcatatgtatgatctaaacacgta 
_.ttxacctcgtaagggtttgccaaatagt^ 
ctgcgaataaaatgttttaaaaaagac^ 

ttttgatgtctccaatctatttcagtttacaattttaaaatatagaatatatat 
atttaggtttcataagttatgcatcgattacgggttctacgtcacttgaagtt 

CTGCATTTCCACGTCACATAGGACTACTGTAGTTTTAAAAAATACTCGTTCAT
TTTGTAATAATATTCCTTCTACTAGTTTTGCTTCTGGTAATAATCGAATTTCAA 
AACTTTAGCTAAAATATTTCTTTTTGAAGAGGCTGCAGCAAAATATGAAAAG 
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AAAAGTCCAACTGAACATGTATTACTTCGACCCGATACATATATTGGAGGTG 

TCGCCATGCGAGAAGATCAAATTATTTGGCTCAGAGACTCAGAAAATAGAAA 

AATGATTGCAAAAGAAGTCACTTATCCACCTGGATTATTGAAGATTTTCGATG 

AGATTCTAGTGAATGCGGCTGATAATAAAGCAAGAGATTCCAGTATGAATCG 

GTTGGAAGTATGGTTAGATAGGTAAATATATTGCAGGAATTTATGTTCTGCGA 

CAAAGCTACGATACGCTGTCTCGCCACGACAATTGTTTTGGTAAATGCATGA 

AAATCGACGTGCACCTTTAAATAATACTGTAGTTTTAAATTCTCGTTTCTTCA 

ATTTTTCATAAATGGTTTTCCGATGAATATATGATTTTAAAAAAATCTAAAAT 

TCACATTAATTTATAAGAAACAAAATTCCTCAAAAACGAAAGTTTGGCGATA 

CAGTACTATC 
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LIN(n4256) amino acid sequence 

MDQQEPSNNVDTSSILSDDGMETQEQSSFVTATroLTVDDYDETEIQEILDNGKA 

EEGTDEDSDLVEGILNANSDVQALLDAPSEQVAQALNSFFGNESEQEAVAAQRR 

VDAEKTAKDEAELKQQEEAEDLIIEDSIVKTDEEKQAVRRLKINEFLSWT^ 

QFKNFEFTNPNYLTESISDSPVVNVDKCKEIVKSFKESESLEGLSQKYELIDEDVL

VAAICIGVLDTNNEEDVDFNVLCDDRIDDWSffiKCVmDYPNTGLNSKNGPLRF 

MQFTVTSPASAILMLTLIPvLREEGHPCRLDFDSNPTDDLLLNFDQVEFSNNIIDTA 

VKYWDDQKENGAQDKIGRRVLrKLTTVLKNAVGSRNEnQLVNEKJPDFDGTEA 

AVNESFTSDQRTEimSRAFMETLKAEMKLAIAEAQKVYDTKTDFEKFFVLTVGD 

FCLARANPSDDAELTYAIVQDRVDAMTYKWFroTSQIRECNIRDLAMTTQGl^ 

DPSLNTFGDVGLRVACRQVISSSQFGKKTrWLTGTAAGRPvRAHRSDFLlFFDNGT 

DAYYSAPTMPGEPGYEVASEKKSWSLKEMIAKMNAAQIAEvIVGQPVGKEGNL 

DYFLTmwmQSmSAYIRDFMKEFPEWPLLKMPVGMRlCLYNSLVDRPJCKMVT 

VIGTDRAFAIVRHEAPOTLAPGNRCTDFPCNDRNHQHIDEKIYRGSHRLEGAA 

KHMISTNN^SQRRKDQLQSQFEPTDM^ 

TNDAKSKREmiRKKNQFLFmilWIPVLTPLENLKAHAQCGPDCLQ 

AJ^HRNSPfflTPLLCGWRPJlvr^TMSTGKKFlGAVKXNIOT 

YIHVTRSLLTroCFSFDAPJDTATYlTVDDKYLKVADFSLGTEGIPrPLVNSVD>JDE 

PPSLEYSKRRFQYNDQVDISSVSRDFCSGCSCDGDCSDA5KCECQQLSIEAMKRL 

PHNLQFDGHDELYESSEKQNKFLKLFFFRWHYQMILLSSKVISGLYECNDQCSC 

FIRKSCYNRWQNNIKYPMHVSLF>©D^ 

STFICTYVGAILTDDLADELRNADQYFADLDLKDTVELEKGREDHETDFGYGGD 

ESDYDDEEGSDGDSGDDVMNIOVrVKRQDSSESGEETKRLTRQKRKQSKKSGKG 

GSVEKDDTTPRDSMEKDNIESKDEPVFNWDKYFEPFPLYV1DAKQRGNLGRFLN 

HSCDPNVHVQIFVMYDTHDLRLPWAFFTRKYVKAGDELTWDYQYTQDQTATT 

QLTCHCGAENCTGRLLKS 
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lin-65 genomic sequence (1 kb of upstream and downstream genomic sequence is 
included in this file) 



Exon number    Exon boundaries (inclusive)
1              1001-1133
2              4522-5208
3              6128-6361
4              7962-8350
5              8706-8928
6              9260-9516
7              10328-10567
8              11677-11700
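The exon table above gives 1-based, inclusive coordinates into the genomic sequence. The following is a minimal Python sketch of how such coordinates could be used to assemble a spliced sequence. The function names `splice_exons` and `exon_lengths` and the toy input are illustrative assumptions, not part of the original disclosure.

```python
from typing import List, Tuple

# Exon boundaries copied from the lin-65 table (1-based, inclusive).
LIN65_EXONS: List[Tuple[int, int]] = [
    (1001, 1133), (4522, 5208), (6128, 6361), (7962, 8350),
    (8706, 8928), (9260, 9516), (10328, 10567), (11677, 11700),
]


def exon_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Length of each exon; both boundaries are counted."""
    return [end - start + 1 for start, end in exons]


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Concatenate exon segments of `genomic` (hypothetical helper).

    Coordinates are 1-based and inclusive, so exon (start, end)
    corresponds to the Python slice genomic[start - 1:end].
    """
    parts = []
    for start, end in exons:
        if not (1 <= start <= end <= len(genomic)):
            raise ValueError(f"exon {start}-{end} lies outside the sequence")
        parts.append(genomic[start - 1:end])
    return "".join(parts)


if __name__ == "__main__":
    # Toy sequence: upper-case letters stand in for exons.
    print(splice_exons("aaCCCttGGGaa", [(3, 5), (8, 10)]))
    print(sum(exon_lengths(LIN65_EXONS)))
```

With the boundaries listed in the table, the eight exons total 2187 nucleotides, which is a multiple of three.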



AAAAATTTAAAAAAATTTTTAAAAATTCGTGTAAAAATTACCCCGGTTGTTTA 
GGAAATAATAAAGAGATTAGAGACTTTTTTCAGATTTTTATTTTCTTGAGm 

CGCTAGTTTTCCCCTCAATTTCTCGATTTTTTC 

GAAAATTGAATTGTTTGCAAAAAAAAAAATTCAAAAACCGCATTTTTCTCAG 
AATTTTTCTGGGATTTTGTACAAATTTTTGAATTATTTCTCAAAAAAAAGCAG 
GTTTTTACCGATTTTTTTGGTTTTTTCCCCAAAATTTTCCGATTTTTTCCGAGTT 
TTGCCGGTTTTCAGCCGAATTCTACTCTCGATTTTTTTACGATTTTTTGGAAAT 
TTTCGGAAAATTATTTGAAAAAAAATCAAAAAACCGCATTTTTTTTTCTGAAT 
TTTCTGGGATTTTGTACGAAATTTTGAAATTTTTCTCGAAAAAAGCAAGTTAT 
TCCCCAAAATTTTCTGATTTTCCCCCAAAAATTTAGATTTTTCCCGAGTTTTCC 
CCAGTTCTCAGCTGATTTCTATATTTTTTTCTCAATTTTTGTGATTTTTTGTT 
TAGTTTTCCCTTCAATTCCTCGAGTTTTTCACGATTTTTTGGAGATTTTCGAAA 
AATTGTTTGAAAAAAATCAAGAAACCACATTTTTCTCTGGATTTTCTCGAAAT 
TTGCACAAAATTTTTGAATTTTTTCGTAAAAAAAAACTGTTTTCCCCAAAAAT 
TTCAGATTTGTTTTTGATTTTTTTCGAGATTTTCCCCTGATTTCAAAGT 
CTGAATTTTTCGAATATTTCCTGAAAAATCGGCTATTTCTAACTTTTTAAATAA 
TTTTTTTTGAATTTCTGACTTTTTAAATCCTTTTTTTTTTGCCATTTTTTCCC 
TAAAATTCTAAATTATTCAAAATTTTACAGAATGTCAGAAGTAATCGACGAA 
AGTATCTTAAATACAGAAGCTTCAGATGATCCAATACCTCCATTAAATGATG 
ATCAGATTGCTGAGCTTTTGGGTGAAGATGGAGAAATTATGGAGATAACTGA 
GCAGAAAGGTGAGATTTTTTGAGTAAAACCTTGAATTTTGCACTAAAAATTTG 
CAATTTTCGCTAAAAATTACCTTAAAACTCGAAAATTGGAATTTCTAGCTGAG 
AAAATGGCCAAAAATGTCGAAAAATGCCTCCGAAACCTGTGAAAAAAAAAA 
CCACCAAAAAGGTTTCTAGGCCACCAAAAAGATTTCTAGGCCACCAAAAATG 
_ JirTGTAGGCiCACCAj^AAATGTTJC^ 

AAAAATGTTTCTAGGCCACCAAAAATGTTTCfAGGCCACCAAACAGGm " 

ATGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCCCCAAAA 

AATTTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGC 

CACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGT 

TTCTAGGCCACCAACCAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCA 

AAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTA 
GGCCACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAA
TGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGTTTCTAGGCC 
ACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTT 
TCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCA 
AACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAACAGGTTTCAA 
TGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAA 
TGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCA 
CCAAACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAACAGGTTT 
CAATGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCCCCAA 
AAAATTTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAG 
GCCACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAG 
GTTTCTAGGCCACCAACCAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCAC 
CAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTC 
TAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAAGGTTTCAAGGCCACCAAA 
AAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAACAGGTTTCAATG 
CCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGACCACCAAAAAGG 
TTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGTTTCTAGGCCAC 
CAAACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTC 
TAGGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAA 
AAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAATGTTTCTAGG 
CCACCAAACAGGTTTCAATGCCCCCAAAAAATTTTTCTAGGCCACCAAAAAG 
GTTTCTAGGCCATCAAAAATGTTTCTAGACCACCAAAAAGGTTTCTAGGCCAC 
CAAAAATGTTTCTAGACCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTC 
TAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAA 
AAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGTTTCTAGG 
CCACCAACCAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAAAAGG 
TTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACC 
AAAAAGGTTTCTAGGCCACCAAAAAGGTTTCAAGGCCACCAAAAAGGTTTCA 
ATGCCACCAAAAATGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAA 
AGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGTTTCTAGAC 
CACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGT 
TTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACC 
AAAAAGGTTTCTAGGCCACCAAuAAATGTTTCTAGGCCACCAAAAATGTTTCT 
AGGCCACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAA
ATGTTTCTAGGCCACCAAACAGGTTTCAATGCCCCCAAAAAATTTTTCTAGGC 
CACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGACCACCAAAAAGGT 
TTCTAGGCCACCAAAAATGTTTCTAGACCACCAAAAAGGTTTCTAGGCCACC 
AAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCA 
ATGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCCCCAAAA 
AATTTTTCTAGGCCACCAAAAAGGTTTCAATGCCACCAAAAATGTTTCTAGGC 
CACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAATGT 
..TXCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAACAGGXn:CAAXG.C.CACC 
AAAAATGTTTCTAGGCCACCAAACAGGTTTCAATGCCACCAAAAAGGTTTCT 
AGGCCACCAAAAATGTTTCTAGACCACCAAAAAGGTTTCTAGGCCACCAAAC 
AGGTTTCAATGCCACCAAAAAGGTTTCTAGGCCACCAAACAGGTTTCAATGC
CACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGT 
TTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACC 
AAACAGGTTTCAATGCCACCAAAAATGTTTCTAGGCCACCAAACAGGTTTCA 
ATGCCACCAAAAATGTTTCTAGGCCACCAAAAATGTTTCTAGGCCCCCAAAA

AATTTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGAC 

CACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGACCACCAAAAAGGT 

TTCTAGGCCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACC 

AAAAATGCTTCTAGGCCACCAAAAATGTTTCTACGCCACCAAAAGCCG CCTC 

AAGCCCGAAAAATTTGAATTTCCCGCTCAAAAAATCTAAAATTTTCCGATTTT 

CAGACGAATCAGATGATGTGGTGATGCTGGACGACGATGATGACGACACTCC 

GGAACCGATTCTCGTGATTGATATGGATGAGGATGAGGATGTTACTACAGAT 

GGTCCTGAATCTCAGGAAGAGCTGGCTGCAGATGCTCCGGCTCCAGGAGCTC 

CAGAAGCTTCAGCTCCAGCTCAAGAAGCCTCAGAAGCTTCAGCTCCGGATCA 

AGAAGCTCCAGAAGTTCAGGATGTTCCGGATTCTTCGGGAGCTCCAGATGCT 

TCAGCTCAGGCTTCAGAGGCTTCTGATGCTTCAGCTCCAGAAGTTCCAGGATC 

TACAGAAGCTCAGGATGCTCAGGATGTTCCGGATTCTTTGGGAGCTTCAGAT 

GCTTCAGCTGAAGAAATTGCAGAAGCTCCAGAAGCCCCAGAAGCTCCAGAAA 

TCGCCGCTGAAATCGACGAAGAAGTGGTGCTCGCCGAGCAAAATGGAGTTTT 

GGACGAAGGATTTGATGAGACTGACGATATTATCATAGAAGAAGAAGCTGTA 

GAAGAAGCTGAAGCCGTGGAGCCACCAATTAACACTGAAAATCAGGAAAAC 

GCGCTGGAAATGCTCGAAGAGCGCCTCAAGAAGAATGAAGAAAAGGAAATT 

GTGGAGAAAAGTGATGTGAAGCCAGAGGATGAAGATATTATACATATGGAG 

ACGGATTCAGTTGAAAGTATGGGCTTTTTTAGCTGGAAAACAGGAAAAAAGA 

GCAAAAAATTGATACATTTCCAGCTTAACCAATCTTTTTTTGAGTTGTAAAGC 

CTGAAAATTGAGATTTTTGTACCAACTTTTATGATAAAGCTGAAAAAAAAATr 

AATTTTTTGACGAATTTTTAGCGGAAACCCTGAAAACATGTTTTGTCTGAAAA 

ATACAGAAAATCGTCACTTTTTACAATAAATTCGAGATTTTTAGCTCAAAAAT 

ACAACATTATAGTGCAAAAATCTCAGAAAAAGCCAAAAATTTCATTCAAACA 

TCTCAAAAAAAGCAGAAATTTTACTCAAAATATCTCAGAAAAAGCTAAAATT 

TTCCCAAAAAATCCCAGAAAAAGCAGAATTTTCATTCAAAATTCCCAGAAAA 

AGCTGATAATTTACTAAACAATCTCAGAAAATGCTGAAATTTTACTCAAAAG 

TCTTCATAAAAAGCTGAAATTTTACTTTAAAAGTTTAGGAAATGCTGCAATTT 

CACTTAAAAATCCCAAAAAAGCTAAAATTTTCCCAAAAAATCCCAGAAAAAG 

CAGAAATTTTACTCGAATATCTCAAAAAAAAAAAAGCTGAAATTTCACTCAA 

AAATCCCAGAAAAAGCTAAAAATTTACTAAAAAATCTCAAAAAAAAAAACG 

CTAAAATTTCACTCAAAAATCTCAGAAAAAGCTAAAATTTTACTCGAATATCT 

CAAAAAAAAAAACTGAAATTTTCCTAAAAAATTTATGAAAAACCGAAATTTC 

ACTTAAAAGTCTCATAAAAAGCCGAATTTTCCCAAAAAAATCCCAGAAAAAG 

CTAAAAATTTACTTTAAAATCTCATCTGTAATTTTAGTTTAAAATCTCAGAAA 

AACCCGAAATTTCTCTCAAAAATTTGCTGATTTTCAAATTTTCAGCGTCAAGC 

CGCAAACGTACTGGCGGAGCCACAAGTCCGCGGAGCCCGGCTCAAAAACGA 

CCAAAACGACGTGTTCAAACGTTATTAAAGATGCGTCAGAATGCAATTGAAC 

TATTGACACGACTTTATGGCTCATGGGATGCACAATTGAGCCTCTCAAATCTT 

GAGACAATTCGATTGTTGGGTGTCAATAATAATAGGAAGCTTATCGAAATTTT 

t-^K3AGGAGAAJH3A4jCA 

ATTITCGATAAAAAAACGGATTTTTGGAAGAAAATCGCCTGAAAATTCATGT 

TTTTCTGCAAATTTTGACCAAATTCCCAAGAAAAATACGATTTTTTAGTCCGA 

AAATCCTCCAAAAAGATTTCTAGGCCACCAAAAAGGTTTCTAGGGCACCAAG 

AAAGTTXCTAGGCCACCAAAGTATTTATAGGCCACCTAAGATGTTTCTAGGCC 

ACCTGAGATGTTTCTAGGTCACCAAAAATGTTTCTCGGTCACCAAAAATGTTT 

CAAGGCCACCGAAAAGGTTTCTAGGCCACCTAAGTATTTCTAGGCCACCTAA 
GATGTTTCTAGGCCACCTGAGATGTTTCTAGGTCACCAAAAATGTTTCTAGGT
TACCAAAAATGTTTCAAGGCCATCGAAAAGGTTTCTAGGCCACCAAAGTATT 
TCTAGGCCACCTAAGATGTTTCTAGGCCACCTGAGATGTTTCTAGGTCACCAA 
AAATGTTTCAAGGCCACCGAAAAGGTTTCTAGGCCACCAAAAAGGTTTCTAG 
GCCACCAAAAATATTTCTAGGCCACCTAAGATGTTTCTAGGCCACCTGAGAT 
GTTTCrAGGCCACCTGAGATGTTTCTAGGCCACCTGAGATGTTTCTAGGTCAC 
CAAAAATGTTTCTCGGTCACCAAAAATGTTTCAAGGCCACCGAAAAGGTTTC 
TAGGCCACCTAAGTATTTCTAGGCCACCTAAGATGTTTCTAGGCCACCTGAGA 
TGTTTCTAGGTCACCAAAAATGTTTCTAGGTTACCAAAAATGTTTCAAGGCCA 
TCGAAAAGGTTTCTAGGCCACCAAAGTATTTCTAGGCCACCTAAGATGTTTCT 
AGGCCACCTGAGATGTTTCTAGGTCACCAAAAATGTTTCAAGGCCACCGAAA 
AGGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATATTTCTAGGC 
CACCAAAAATGTTTCTAGGTCACCAAAAATGTTTCTAGGTCACCAAAAATGT 
ATCAAGGCCACGAAAAAGGTTTCTAGGTCACCAAAAA-TGTTTCTAGGCCACC 
AAA^TGTTTCTAGGTCACCAAAAATGTTTCTAGGCCACCAAAAAGGTTTCT 
AGGCCACCAAAAAGGTTTCTAGGCCACCAAAAAGGTTTCTAGGCCACCAAAA 
AGGTTTCAAGGCCACCAAAAAGGTTTCTAGGCCACCAAAAATGTTTCTAGGT 
CACCAAAAATGTTTCTAGGCCACCAAAGTATTTCTAGGCCACCTAAAAGGTTT 
CTAGGCCATCAAAAAGGTTTCTAGGCCATCAAAAAGGATTCTAGGCCACCAA 
AAATATTTCTAGGCCACCTAAGATGTTTCTAGGCCACCAGAGTATTTCTAGGC 
CACCTAAGAGGTTTCTGGGCCATCAAAAAGGTTTCAAGTCCATCAAAAAGGT 
TTCTAGGCCACCAAAAAGGTTTCTAGGCCACCGAAAAGGTTTCTAGGCCACC 
AAAAAGGTTTCTAGACCACCTAAGACATTTCTAGGCCAACAAAAAGGTTTCT 
AGGCCACCAAGAAGCCGAAAAACTGTCTCAAATTCGAATTTTGCAGTGCTCA 
AACAAAAAGTGTCCGCACTGACAGAAGAGCTGAAAAAGGAGAAGCTGGCTC 
ACGCGGGAACCCGTTCAGCATTGAAAGAATTGACTAATGAAATAACTGGAAT 
GCGTGTACAAATGAATAAACTACGTTCAATGGTCACTCAGCCTACGACTTCG 
AAAATTATTGATAGTTTTGTTCAACGTCATCAGGCTTTCGAGCAGCAACAACA 
ATTCCAACACCAACACCACCAACACCGACCAATAATGTTGGCTCCACGTCAT 
CATCCGCCGCCGCCCCCGCATTTTACACCGAATCAACGGGCGGCGGCTCCGT 
ATGATCCGAATATGGTTCAACCGAATCGTCTTGCTGCTATGCCACATAGAAGA 
CCGATTATTGGAATGCAGGTGAAAATGGAATGCCATGAAAATTTCGGGCCGG 
AAAATTTTGGAAAATCCrCTAAATTTTCAATATTTGTCGAAAAAATCTGACAA 
AAATCGTGTCAAAATTCAGATTTCCGGGAGAAAAATCGCATTTTTGAGTAAA 
AATTCGAAGAAAAGCGTCTTAAATTCTAGATTTATTAGTTAAAATTTTTTTCA 
AATTTTAGTCAAGAAAATTAAGAAAAATGCGAAAATTTCGAGCAAAAAATAT 
AGTTTTTTGGAGCCGAAATTGTGAAAAATGCGATTTTTTTCGAAAAATCTGGA 
CAAAAAATTTCAAACAAGAAAAACCACTTTTTTAAAAAAATTTTCACACAAT 
TTCCAGCAACAAAATTCGGCTCCACCACAATTCAACGGTCACCAAGCTCTCGT 
CCCATCACCTCAATCATCATCrGCATTTTCTCGTCCACCACCAACTCAACTTG 
CAACACAGAGAAGAGCTCCACCATTGGCAAGTACCGGCCTTCCGGCAACAGT 
. ... rAGAXGGGAAGCA^IXCCACCGrCAAAA^TCCjGAATCICaGGCACAATGA 
GCCACCGCTTAACAATGGAGGTTCGTCGTGTGCAACAAAAAGAGCACCGCTT 
TTCCACGACGAGTTTTTGCGATGATGATTTTGGTGTGAAAATTGAAAAACTCA 
TTTTTTTAAAGTCTGAAATTTGAAAATTTGAGAAAAGTTTTTTAAAAAAAGTT 
TTATGAGGGATTTTCTGACAATTTTTTATAAACGGAAAATTACGAAAACTCCA 
AAATTTGTGTTCTTTCGGAAAACGAATTTGAAATTTGAACCAAAATTTTGACA 
ATTTTCTGGGGATTTTTGACTGGAAATTCGTTTTTCATCGATTm 
AATTTTCGGTAAAACCCCTGTCTCCAATTCCAGGCCGTGCACAGCCACTAATC
GATAATACACGTGTACACGACAATACAATTATGCTGTGTGTACCACTTGTCTC 
CACTGCAAATACAATATCATCGGGCGATTCGACACGTCTACCAAAAGTACCA 
CGAATCTACGAGAATCTCACGGCAAATCCCGATTTGAGTGTGACGATTCATTC 
GAGTGCACAGGATTTCCGAGAGAATTATCAAATTGGTGGAAAGATTAACTAT
GAATATCTCGGAGGATTTGATCAATATGTAGGTGATGATGTTTTTTTATTGAG 
AGATAAATACGAAATTCCATTACAATCGATATTTTTTGACTGAAAAATGTCTG 
AAAAATCAAAAATTTTAGCTAAAAATTGAGAATATTTTTGTTTAAAAAAAAT 
CATTGAAATTGATTTTTTTTTATTCCATAAAAATCTCGGAAAAGTCA ATTTT C 
AGTCATAAATCTTCTGAAAATTATCCAAACAATGGGATTTTCTGAAATTTTAG 
CTTAAAAATTGAGGATTTCCCGGTTTTTTCAGAGAAATTCCATTACAATCGAT 
TTTTTTACTGAAAAATCCTCTGGAAATTAACAAAAACCAAATAAAATGCCCT 
AATTTTTTTTTAAATCCAAAAATTGTTGGATTTTTTCAGAAAAAAATATTTTTT 
CAATTGACTGGTGTGCAAAAAATATAGAAAATTCAAATTTTCCAAGAAAAAT 
^GCCAAAAAAATGrAATTTTTGTCTAACAAAAAAATTGAATAGCGCAAAAT^ 
AAATTGTCGTTTTTTTTAATTTCCCTCCGGTTTTGAAAGGAAAAAATTCCATA 
AAAATCGAAATTTTTTGACTGAAAAATCCATGAAAACTCGAATTTTGAGTCA 
AAAATCCTCTGAAAATGCTCCAAAATATGAGATTTTCTGAAATTTCATCAAAA 
ATTAAGAATTTCACGGTTTAAAAAAAATTCCATTAAAATCGATATTTTTCAAG 
TGAAAAATCrCTGGAAAACTCGATGTTTGAGTCAAAATTCGTCTGAAAATGC 
TCCTTTAAATTGAAAAATTGAAAAAAAAACCGCCCACAATATTTGCAGAATA 
TCCAAGTGTTCGTCCAAGTGTCATCTCTTAAATTCACTGGAATGAACGGTTAC 
CCGGATCCAGAAGATCGTATATCAATTGACTGGGGATGCTCGAAATTGTGGC 
CTTGTAAGCCGAAATCTCATCACAAATTCCGTGTACGCTTCCATCAAGCACAA 
CTGCTGCCGAAGAACGATCGAATTACGATTGTGGCTGTGGCGAAGGATAAAA 
CTAGCGGAATTATTCACATTTCGCAGGTGAAAAATTGGAAAATT TGC ACAAA 
TCCAGACAAAAAAAACTGAAAAATCGAAAAAATTTTTGTAATTTTTTGCCGA 
AAACGAAAATTAAAAACTGATAAAAATTGATTTTTAACCGGAAAATCCCTGA 
AAAATCAAACATTTTTTGCTAAAAATTGAGAATTATACGGTTTTGGGTAAAA 
AAAAACTATTTAAAAAAAATATTTTTTCTTTAAAAATCTCAACAAAAAAAAA 
ACCAATTTTCATTCAGAAATCCCCCCGGAGAATTGTCAAAATTTTGGGAATAC 
TCTGAAATTTCGATAAACACCTCATTTTTGATTAAAATTGATTTTTTAACTGA 
AAAATCCCTTAAAAAACGAATATTTTAGTTTTTTCACAAAAAAATGTGCAATT 
TATCTGAAATTTCAGCAAAAAAAATGAAAAAAAAAAATTCCGAAATTAAAA 
ACTGATAAAAATCGATTTTTTACTTGAAAAATTCGTGAAAAATCAAACACATT 
TTTGCTAA CCATTG A G AATATT A CG ATTTTGTG A AA A A A AAAACC ATT AAAA 
rrGATmTTATTCCTAAAAAATGCCAGAAAAATCAATTTTCAGTCAAAAATC 
ACCGGAAAATTATCAAAATTTTGAGGTTTTCTGTGAAATTTCAAGCTGAAATT 
TCCATTTTTGAATAAAAAAAATGTGGCTGGATTTAAAAAAAAACCATTAAAA 
TTGATTTTTTAACTGAAAAATCCGTATTTCTCTGAAATTTCAGGCAAAAAATG 
TCATTTCCGAAATTAAAAATTGCGACAAAATCAAATAAAATTGATCAAATTT 

AAAAACTCGAAfTTTC^ 

GATTTTTTGAAATTTTAGCTAAAAATTGAGAATTGCACGGTATTTAGAGAGG^ 

AAAAATTCCATAAAAATCGATATTTrCCTCTTTAAAATCT CGAA AAAAATCAT 

CAATTTTCATrCAAAAATCCCCCCCGGAAAATTGTCAAAATTTTGAGATTTTT 

CTGAAATTTCACGCAAAAATTTTCATTTTTTCAGCCCACCTTCATCACTCTCGA 

ATGATCGATCTCTTCACGTCAAATGCACTTTTTTCTGGATTTTrTTGTTAAAAA 
TAATTTTTTTTTTGTTGATTTTCTTAATTTTTTTAATTTTCAAAAAATCTTTTTC

ATCTCTrTCTCTCTCrCTCTGAATCTCAATTTTTTCCTGAATTTCCCCGTT^ 

TCTGATAATTTTCAATATTTCTCTGAATTTTTCTATTCCCCCCGTTGTAATGCC 

AAAATATGTGGTAATTTCTCCCCATTTTTTCGCTTTATTACTATTTATTCTATT 

CAATTGGTGCCTCTCTCAATGTGTTGTATGAAAAACACTGTTTTATGGAGGTT 

TTGGAGAATTTTGAATTTTTTCGTCGTGATTTTTATTGGTTTTCTTTACCAA^ 

CAATTTTTTTTTTAATTCGAAAATTTGTAGAAATTCACTTTTGTAGCTTAAAAA 

ATTAAAAATTGAGAAAATTTGTTCAAAAATGGCAAAGTTTTCGAAATTTTAGT 

CTAAAAAAAGATTTTTTTAATATAGAATTTTAAAAAATTAGCACAGAAAAAT 

GCCGAAAAATTCGTAATTTTTCATTTAAAAATGAAAAAAAAAAAAAACAAAA 

AAAAAAAAAAAAAAGAGGGAAAAATCCCATTAAAAGTAGTTTTTTGACTGC 

AAAATCGTCTGGAAATTAACAAAATTTAAAAAAATCTTTTTTACAGCCCATCG 

TTTCCAAAAACCAAATAAAATGCCAAAAAAAAATTTTTATGCAAAAATTCTG 

GATTTTTTTCCGAOTTTTCAAAAAATTCCCCCTTCTAAAAAAAATGGTGAAT 

TTGTTCCCAAAAACCCAAAATTTGAGATTTTCTAAAATTTTGGCAAAAATTAA 

GAATTTCACGGTTTTGAGAGGGAAAAACTCCATTAAAATTGATGATTTTATGA 

CTAAAAATTCCTAAAAAATCAATTTTCAGTCAAAAATTAAATTT 
DQEAPEVQDVPDSSGAPDASAQASEASDASAPEVPGSTEAQDAQDVPDSLGASD

ASAQEIPEAPEAPEAPEIAAEIDEEVLLAEQNGVLDEGFDETDDIIIEEEAVEEAEA 

VEPPINTENQENALEMLEERLKKNEEKEIVEKSDVKPEDEDIIHMETDSVETSSRK

RTGGATSPRSPAQKRPKRRVQTLLKMRQNAIELLTRLYGSWDAQLSLSNLETIRL

LGVNNNRKLIEIFEENEQVLKQKVSALTEELKKEKLAHAGTRSALKELTNEITGM

RVQMNKLRSMVTQPTTSKIIDSFVQRHQAFEQQQQFQHQHHQHRPMLAPRHHP 

PPPPHFTPNQRAAAPYHPNMVQPNRLAAMPHRRPIIGMQQQNSAPPQFKGHQAL

VPSPQSSSAFSRPPPTQLATQRRAPPLASTGLPATWWEAffPPKNPNVGHNEPPL

NNGGRAQPLIDNTRVHDNTIMLCWLVSTAATISSGDSTRLPKVPRIYENLTANPD

LSVTIHSSAQDFRENYQIGGOSro

EDRISIDWGCSKLWPCKTKSHHKFRVRFHQAQLLPKNDRITIVAVAKDKTSGIIHI
SQPTFITLE 
1 aaggaattag actctttatc taaagtgaag aatgatcaat taagaagttt ttgtcccata
61 gaattaaata taaatggatc tcctggggca gaatctgatt tggcaacatt ttgcacttct
121 aaaactgatg ctgttttaat gacttctgat gatagtgtga ctggatcgga attatcccct
181 ttggtcaaag catgcatgct ttcatcaaat ggatttcaga atattagtag gtgcaaagaa
241 aaagacttgg atgatacctg catgctgcat aagaagtcag aaagcccatt tagagaaaca
301 gaacctctgg tgtcaccaca ccaagataaa ctcatgtcta tgccagttat gactgtggat
361 tattccaaaa cagtagttaa agaaccagtt gatacgaggg tttcttgctg caaaaccaaa
421 gattcagaca tatactgtac tttgaacgat agcaaccctt ctttgtgtaa ctctgaagct
481 gaaaatattg agccttcagt tatgaagatt tcttcaaata gctttatgaa tgtgcatttg
541 gaatcaaaac cagttatatg tgatagtaga aatttgacag atcactcaaa atttgcatgt
601 gaagaatata agcagagcat cggtagcact agttcagctt ctgttaatca ttttgatgat
661 ttatatcaac ctattgggag ttcaggtatt gcttcatctc ttcagagtct tccaccagga
721 ataaaggtgg acagtctaac tctcttgaaa tgcggagaga acacatctcc agttctggat
781 gcagtgctaa agagtaaaaa aagttcagag tttttaaagc atgcagggaa agaaacaata
841 gtagaagtag gtagtgacct tcctgattca ggaaagggat ttgcttccag ggagaacagg
901 cgtaataatg ggttatctgg gaaatgtttg caagaggctc aagaagaagg gaattccata
961 ttgcctgaaa gaagaggaag accagaaatc tctttagatg aaagaggaga aggaggacat
1021 gtgcatactt ctgatgactc agaagttgta ttttcttctt gtgatttgaa tttaaccatg
1081 gaagacagtg atggtgtaac ttatgcatta aagtgtgaca gtagtggtca tgccccagaa
1141 attgtgtcta cagttcatga agattattct ggctcttctg aaagttcaaa tgatgaaagt
1201 gattcagaag atacagattc ggatgatagc agtattccaa gaaaccgtct ccagtctgtt
1261 gtggttgtgc caaagaattc tactttgccc atggaagaaa caagtccttg ttcttctcgg
1321 agcagtcaaa gttatagaca ctattctgac cattgggaag atgagagatt ggagtcaagg
1381 agacatttgt atgaggaaaa atttgaaagt atagcaagta aagcctgtcc tcaaactgat
1441 aagtttttcc ttcataaagg aacagagaag aatccggaaa tttcttttac acagtccagt
1501 agaaaacaaa tagataaccg cctgcctgaa ctttctcatc ctcagagtga tggggttgat
1561 agtacaagtc atacagatgt gaaatctgac cctctgggtc acccaaattc agaggaaacc
1621 gtgaaagcca aaataccttc taggcagcaa gaagagctgc caatttattc ttctgatttt
1681 gaagatgtcc caaataagtc ttggcaacag accactttcc aaaacaggcc agatagtaga
1741 ctgggaaaaa cagaattgag tttttcttcc tcttgtgaga taccacatgt ggatggcttg
1801 cactcatcag aagagctcag aaacttaggt tgggacttct ctcaagaaaa gccttctacc
1861 acgtatcagc aacctgacag tagctatgga gcttgtggtg gacacaagta tcagcaaaat
1921 gcagaacagt atggtgggac acgtgattac tggcaaggca atggttactg ggatccaaga
1981 tcaggtagac ctcctggaac tggggttgtg tatgatcgaa ctcaaggaca agtaccagat
2041 tccctaacag atgatcgtga agaagaggag aattgggatc aacaggatgg atcccatttt
2101 tcagaccagt ccgataaatt tcttctatcc cttcagaaag acaaggggtc agtgcaagca
2161 cctgaaataa gcagcaattc cattaaggac actttagctg tgaatgaaaa gaaagatttt
2221 tcaaaaaact tagaaaaaaa tgatatcaaa gatagagggc ctcttaaaaa aaggaggcag
2281 gaaatagaga gtgattctga aagtgatggt gagcttcagg acagaaagaa agttagagtg
2341 gaggtagagc agggagagac atcagtgccc ccaggttcag cactggttgg gccctcctgt
2401 gtcatggatg acttcaggga cccacagcga tggaaggaat gtgccaagca agggaaaatg
2461 ccatgttact ttgatcttat tgaagaaaat gtttatttaa cagaaagaaa gaagaataaa
2521 tctcatcgag atattaagcg aatgcagtgt gagtgtacac ctctttctaa agatgaaaga
2581 gctcaaggtg aaatagcatg tggggaagat tgtcttaatc gtcttctcat gattgaatgt
2641 tcttctcggt gtccaaatgg ggattattgt tccaatagac ggtttcagag aaaacagcat
2701 gcagatgtgg aagtcatact cacagaaaag aaaggctggg gcttgagagc tgccaaagac
2761 cttccttcga acacctttgt cctagaatat tgtggagagg tactcgatca taaagagttt
2821 aaagctcgag tgaa^¥gta~fgcacgaaac aaaaacatcc attactattt catggccctef
2881 aagaatgatg agataataga tgccactcaa aaaggaaatt gctctcgttt catgaatcac
2941 agctgtgaac caaattgtga aacccaaaaa tggactgtga acggacaact gagggttggg
3001 ttttttacca ccaaactggt tccttcaggc tcagagttaa cgtttgacta tcagttccag
3061 agatatggaa aagaagccca gaaatgtttc tgcggatcag ccaattgccg gggttacctg
3121 ggaggagaaa acagagtcag catcagagca gcaggaggga aaatgaagaa ggaacgatct
3181 cgtaagaagg attcagtgga tggagagcta gaagctctga tggaaaatgg tgagggtctc
3241 tctgataaaa accaggtgct cagcttatcc cggctaatgg ttagaattga aactttggag
3301 cagaaactta cctgtctgga actcatacag aacacacact cacagtcctg cctgaagtcc
3361 tttctggaac gtcatgggct gtctttgttg tggatctgga tggcagagct aggtgacggc 
3421 cgggaaagta accagaagct tcaggaagag attataaaga ctttggaaca cttgcccatt 
3481 cctactaaaa atatgttgga ggaaagcaaa gtacttccaa ttattcaacg ctggtctcag 
3541 actaagactg ctgtccctcc gttgagtgaa ggagatgggt attctagtga gaatacatcg 
3601 cgtgctcata caccactcaa cacacctgat ccttccacca agctgagcac agaagctgac 
3661 acagacactc ccaagaaact aatgtttcgc agactgaaaa ttataagtga aaatagcatg 
3721 gacagtgcaa tctctgatgc aaccagtgag ctagaaggca aggatggcaa agaggatctt 
3781 gatcaattag aaaatgtccc tgtagaggaa gaggaagaat tgcagtcaca acagctactc 
3841 ccacaacagc tgcctgaatg caaagttgat agtgaaacca acatagaagc tagtaagcta 
3901 cctacatctg aaccagaagc tgacgctgaa atagagctca aagagagcaa cggcacaaaa 
3961 ctagaagaac ctattaatga agaaacacca tcccaagatg aagaggaggg tgtgtctgat 
4021 gtggagagtg aaaggagcca agaacagcca gataaaacag tggatataag tgatttggcc 
4081 accaaactcc tggacagttg gaaagaccta aaggaggtat atcgaattcc aaagaaaagt 
4141 caaactgaaa aggaaaacac aacaactgaa cgaggaaggg atgctgttgg cttcagagat 
4201 caaacacctg ccccgaagac tcctaatagg tcaagagaga gagacccaga caagcaaact 
4261 caaaataaag agaaaaggaa acgaagaagc tccctctcac caccctcttc tgcctatgag 
4321 cggggaacaa aaaggccaga tgacagatat gatacaccaa cttctaaaaa gaaagtacga 
4381 attaaagacc gcaataaact ttctacagag gaacgccgga agttgtttga gcaagaggtg
4441 gctcaacggg aggctcagaa acaacagcaa cagatgcaga acctgggaat gacatcacca 
4501 ctgccctatg actctcttgg ttataatgcc ccgcatcatc cctttgctgg ttacccacca 
4561 ggttatccca tgcaggccta tgtggatccc agcaacccta atgctggaaa ggtgctcctg 
4621 cccacaccca gcatggaccc agtgtgttct cctgctcctt atgatcatgc tcagcccttg 
4681 gtgggacatt ctacagaacc cctttctgcc cctccaccag taccagtggt gccacatgtg 
4741 gcagctcctg tggaagtttc cagttcccag tatgtggccc agagtgatgg tgtagtacac 
4801 caagactcca gcgttgctgt cttgccagtg ccggcccccg gcccagttca gggacagaat 
4861 tatagtgttt gggattcaaa ccaacagtct gtcagtgtac agcagcagta ctctcctgca 
4921 cagtctcaag caaccatata ttatcaagga cagacatgtc caacagtcta tggtgtgaca 
4981 tcaccttatt cacagacaac tccaccaatt gtacagagtt atgcccagcc aagtcttcag 
5041 tatatccagg ggcaacagat tttcacagct catccacaag gagtggtggt acagccagcc 
5101 gcagcagtga ctacaatagt tgcaccaggg cagcctcagc ccttgcagcc atctgaaatg 
5161 gttgtgacaa ataatctctt ggatctgccg cccccctctc ctcccaaacc aaaaaccatt 
5221 gtcttacctc ccaactggaa gacagctcga gatccagaag ggaagattta ttactaccat 
5281 gtgatcacaa ggcagactca gtgggatcct cctacttggg aaagcccagg agatgatgcc 
5341 agccttgagc atgaagctga gatggacctg ggaactccaa catatgatga aaaccccatg 
5401 aaggcctcga aaaagcccaa gacagcagaa gcagacacct ccagtgaact agcaaagaaa 
5461 agcaaagaag tattcagaaa agagatgtcc cagttcatcg tccagtgcct gaacccttac 
5521 cggaaacctg actgcaaagt gggaagaatt accacaactg aagactttaa acatctggct 
5581 cgcaagctga ctcacggtgt tatgaataag gagctgaagt actgtaagaa tcctgaggac 
5641 ctggagtgca atgagaatgt gaaacacaaa accaaggagt acattaagaa gtacatgcag 
5701 aagtttgggg ctgtttacaa acccaaagag gacactgaat tagagtgact gttgggccag 
5761 ggtgggagga tgggtggtca ggtaagacag actctaggga gaggaaatcc tgtgggcctt 
5821 tctgtcccac ccctgtcagc actgtgctac tgatgataca tcaccctggg gaattcaacc 
5881 ctgcagatgt caactgaagg ccacaaaaat gaactccatc tacaagtgat tacctagttg 
5941 tgagctgttg gcatgtggtt agaagccatc agaggtgcaa gggcttagaa aagaccctgg 
6001 ccagacctga ctccactctt aaacctgggt cttctccttg gcggtgctgt cagcgcacag 
6061 acccatgcgc atccccaccc acaacccttt accctgatga tctgtattat attttaatgt 
6121 atatgtgaat atattgaaaa taatttgttt tttcctggtt tttgtttggt tttcgttttg 
6181 cttttagcct ctacatgcta ggatcacagg aagactttgt aaggacagtt taagttctcc 
H52^^~^clia~g§ra^ ggctgccttg^ggttttggrc- 
6301 cagccttgtg ctatgttgat aagattgatt tactgcttaa aatcacttta ctttatccaa 
6361 tttttactga actttttatg taaaaaaata aaatcaatta aag 
Figure 30

KELDSLSKVKNDQLRSFCPIELNINGSPGAESDLATFCTSKTDAVLMTSDDSVTGSELSPLVKACMLSSNG
FQNISRCKEKDLDDTCMLHKKSESPFRETEPLVSPHQDKLMSMPVMTVDYSKTVVKEPVDTRVSCCKTKDS
DIYCTLNDSNPSLCNSEAENIEPSVMKISSNSFMNVHLESKPVICDSRNLTDHSKFACEEYKQSIGSTSSA
SVNHFDDLYQPIGSSGIASSLQSLPPGIKVDSLTLLKCGENTSPVLDAVLKSKKSSEFLKHAGKETIVEVG
SDLPDSGKGFASRENRRNNGLSGKCLQEAQEEGNSILPERRGRPEISLDERGEGGHVHTSDDSEVVFSSCD
LNLTMEDSDGVTYALKCDSSGHAPEIVSTVHEDYSGSSESSNDESDSEDTDSDDSSIPRNRLQSVVVVPKN
STLPMEETSPCSSRSSQSYRHYSDHWEDERLESRRHLYEEKFESIASKACPQTDKFFLHKGTEKNPEISFT
QSSRKQIDNRLPELSHPQSDGVDSTSHTDVKSDPLGHPNSEETVKAKIPSRQQEELPIYSSDFEDVPNKSW
QQTTFQNRPDSRLGKTELSFSSSCEIPHVDGLHSSEELRNLGWDFSQEKPSTTYQQPDSSYGACGGHKYQQ
NAEQYGGTRDYWQGNGYWDPRSGRPPGTGVVYDRTQGQVPDSLTDDREEEENWDQQDGSHFSDQSDKFLLS
LQKDKGSVQAPEISSNSIKDTLAVNEKKDFSKNLEKNDIKDRGPLKKRRQEIESDSESDGELQDRKKVRVE
VEQGETSVPPGSALVGPSCVMDDFRDPQRWKECAKQGKMPCYFDLIEENVYLTERKKNKSHRDIKRMQCEC
TPLSKDERAQGEIACGEDCLNRLLMIECSSRCPNGDYCSNRRFQRKQHADVEVILTEKKGWGLRAAKDLPS
NTFVLEYCGEVLDHKEFKARVKEYARNKNIHYYFMALKNDEIIDATQKGNCSRFMNHSCEPNCETQKWTVN
GQLRVGFFTTKLVPSGSELTFDYQFQRYGKEAQKCFCGSANCRGYLGGENRVSIRAAGGKMKKERSRKKDS
VDGELEALMENGEGLSDKNQVLSLSRLMVRIETLEQKLTCLELIQNTHSQSCLKSFLERHGLSLLWIWMAE
LGDGRESNQKLQEEIIKTLEHLPIPTKNMLEESKVLPIIQRWSQTKTAVPPLSEGDGYSSENTSRAHTPLN
TPDPSTKLSTEADTDTPKKLMFRRLKIISENSMDSAISDATSELEGKDGKEDLDQLENVPVEEEEELQSQQ
LLPQQLPECKVDSETNIEASKLPTSEPEADAEIELKESNGTKLEEPINEETPSQDEEEGVSDVESERSQEQ
PDKTVDISDLATKLLDSWKDLKEVYRIPKKSQTEKENTTTERGRDAVGFRDQTPAPKTPNRSRERDPDKQT
QNKEKRKRRSSLSPPSSAYERGTKRPDDRYDTPTSKKKVRIKDRNKLSTEERRKLFEQEVAQREAQKQQQQ
MQNLGMTSPLPYDSLGYNAPHHPFAGYPPGYPMQAYVDPSNPNAGKVLLPTPSMDPVCSPAPYDHAQPLVG
HSTEPLSAPPPVPVVPHVAAPVEVSSSQYVAQSDGVVHQDSSVAVLPVPAPGPVQGQNYSVWDSNQQSVSV
QQQYSPAQSQATIYYQGQTCPTVYGVTSPYSQTTPPIVQSYAQPSLQYIQGQQIFTAHPQGVVVQPAAAVT
TIVAPGQPQPLQPSEMVVTNNLLDLPPPSPPKPKTIVLPPNWKTARDPEGKIYYYHVITRQTQWDPPTWES
PGDDASLEHEAEMDLGTPTYDENPMKASKKPKTAEADTSSELAKKSKEVFRKEMSQFIVQCLNPYRKPDCK
VGRITTTEDFKHLARKLTHGVMNKELKYCKNPEDLECNENVKHKTKEYIKKYMQKFGAVYKPKEDTELE
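The amino acid sequence of Figure 30 reads as the frame-1 translation of the cDNA listed before it. The following is a minimal Python sketch of such a translation, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 (hypothetical helper).

    Whitespace between 10-base blocks is ignored, translation stops at
    the first stop codon, and an incomplete trailing codon is dropped.
    Codons containing anything other than a, c, g, t become 'X'.
    """
    seq = "".join(dna.split()).lower()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First three blocks of the cDNA listing (positions 1-30).
    print(translate("aaggaattag actctttatc taaagtgaag"))
```

Applied to the first thirty bases of the cDNA, the sketch returns the first ten residues of the Figure 30 sequence.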
Confidently predicted domains, repeats, motifs and features:

name              begin   end     E-value
Pfam:AT hook      47      60      1.80E+01
low complexity    230     243     -
low complexity    327     338     -
low complexity    371     400     -
low complexity    505     530     -
coiled coil       549     621     -
AWS               636     682     8.80E-18
SET               683     811     6.00E-41
PostSET           812     828     7.40E-04
low complexity    1080    1093    -
low complexity    1118    1129    -
low complexity    1138    1158    -
low complexity    1271    1287    -
WW                1361    1393    4.10E-08
low complexity    1447    1468    -
low complexity    1469    1497    -

These features and domains are not shown in the diagram, either because their scores are
less significant than the required threshold, or because they overlap with some other
source of annotation:

name              begin   end     E-value     reason
low complexity    36      50      -           overlap
low complexity    532     554     -           overlap
low complexity    569     615     -           overlap
Pfam:SET          677     811     8.80E-48    overlap
low complexity    734     739     -           overlap
Pfam:WW           1362    1391    1.90E-08    overlap



Figure 31 LIN(n3628) Functional domains 
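The two tables of Figure 31 separate predicted features that are shown from those that are hidden because of a score threshold or an overlap with another annotation. The following is a minimal Python sketch of that kind of partition. It assumes that hits are supplied in priority order and that the caller chooses the E-value cutoff; the names `Hit`, `overlaps` and `partition_hits` are hypothetical, and the cutoff used below is a placeholder rather than the value used to produce the tables.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hit:
    name: str
    begin: int
    end: int
    evalue: Optional[float]  # None where the table shows "-"


def overlaps(a: Hit, b: Hit) -> bool:
    """True when two inclusive residue ranges share a position."""
    return a.begin <= b.end and b.begin <= a.end


def partition_hits(
    hits: List[Hit], max_evalue: float
) -> Tuple[List[Hit], List[Tuple[Hit, str]]]:
    """Split hits into shown and hidden (with a reason).

    Hits are visited in the order given, which is assumed to encode
    their priority. A hit is hidden when its E-value exceeds the cutoff
    or when it overlaps a hit that has already been accepted.
    """
    shown: List[Hit] = []
    hidden: List[Tuple[Hit, str]] = []
    for hit in hits:
        if hit.evalue is not None and hit.evalue > max_evalue:
            hidden.append((hit, "threshold"))
        elif any(overlaps(hit, kept) for kept in shown):
            hidden.append((hit, "overlap"))
        else:
            shown.append(hit)
    return shown, hidden


if __name__ == "__main__":
    hits = [
        Hit("SET", 683, 811, 6.00e-41),
        Hit("Pfam:SET", 677, 811, 8.80e-48),
        Hit("WW", 1361, 1393, 4.10e-08),
        Hit("Pfam:WW", 1362, 1391, 1.90e-08),
    ]
    shown, hidden = partition_hits(hits, max_evalue=1.0)
    print([h.name for h in shown])
    print([(h.name, reason) for h, reason in hidden])
```

With these four rows taken from the tables, the SET and WW hits are kept and the two Pfam hits are reported as overlapping, which matches the "overlap" reason given for them.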
Confidently predicted domains, repeats, motifs and features:

name              begin   end     E-value
low complexity    387     411     -
low complexity    435     449     -
AWS               845     900     7.50E-30
SET               901     1024    3.10E-41
PostSET           1025    1041    2.50E-05
low complexity    1262    1286    -
low complexity    1333    1344    -
low complexity    1425    1437    -
coiled coil       1468    1491    -
low complexity    1569    1589    -
low complexity    1605    1619    -
low complexity    1622    1643    -
low complexity    1690    1710    -
WW                1741    1773    2.10E-11

These features and domains are not shown in the diagram, either because their
scores are less significant than the required threshold, or because they overlap
with some other source of annotation:

name              begin   end     E-value     reason
Pfam:SET          895     1024    6.30E-52    overlap
low complexity    1477    1493    -           overlap
low complexity    1726    1744    -           overlap
Pfam:WW           1742    1771    6.90E-12    overlap

Figure 32 KIAA1732 Domains
SEQUENCE LISTING

<110> MASSACHUSETTS INSTITUTE OF TECHNOLOGY et al.

<120> RB PATHWAY AND CHROMATIN REMODELING
GENES THAT ANTAGONIZE LET-60 RAS SIGNALING



<130> 01997/548WO3 

<150> 60/437,821 
<151> 2003-01-02 

<150> 60/410,160 
<151> 2002-09-12 

<160> 36 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 853 
<212> PRT 

<213> Caenorhabditis elegans 



<400> 1 



Met Val Thr Ala Asp Glu Thr Val Leu Ala Thr Thr Thr Asn Thr Thr

1 5 10 15

Ser Met Ser Val Glu Pro Thr Asp Pro Arg Ser Ala Gly Glu Ser Ser

20 25 30

Ser Asp Ser Glu Pro Asp Thr Ile Glu Gln Leu Lys Ala Glu Gln Arg

35 40 45

Glu Val Met Ala Asp Ala Ala Asn Gly Ser Glu Val Asn Gly Asn Gln

50 55 60

Glu Asn Gly Lys Glu Glu Ala Ala Ser Ala Asp Val Glu Val Ile Glu

65 70 75 80

Ile Asp Asp Thr Glu Glu Ser Thr Asp Pro Ser Pro Asp Gly Ser Asp

85 90 95

Glu Asn Gly Asp Ala Ala Ser Thr Ser Val Pro Ile Glu Glu Glu Ala

100 105 110

Arg Lys Lys Asp Glu Gly Ala Ser Glu Val Thr Val Ala Ser Ser Glu

115 120 125

Ile Glu Gln Asp Asp Asp Gly Asp Val Met Glu Ile Thr Glu Glu Pro

130 135 140

Asn Gly Lys Ser Glu Asp Thr Ala Asn Gly Thr Val Thr Glu Glu Val

145 150 155 160

Leu Asp Glu Glu Glu Pro Glu Pro Ser Val Asn Gly Thr Thr Glu Ile

165 170 175

Ala Thr Glu Lys Glu Pro Glu Asp Ser Ser Met Pro Val Glu Gln Asn

180 185 190

Gly Lys Gly Val Lys Arg Pro Val Glu Cys Ile Glu Leu Asp Asp Asp

195 200 205

Asp Asp Asp Glu Ile Gln Glu Ile Ser Thr Pro Ala Pro Ala Lys Lys

210 215 220

Ala Lys Ile Asp Asp Val Lys Ala Thr Ser Val Pro Glu Glu Asp Asn

225 230 235 240

Asn Glu Gln Ala Gln Lys Arg Leu Leu Asp Lys Leu Glu Glu Tyr Val

245 250 255

Lys Glu Gln Lys Asp Gln Pro Ser Ser Lys Ser Arg Lys Val Leu Asp

260 265 270

Thr Leu Leu Gly Ala Ile Asn Ala Gln Val Gln Lys Glu Pro Leu Ser

275 280 285

Val Arg Lys Leu Ile Leu Asp Lys Val Leu Val Leu Pro Asn Thr Ile

290 295 300

Ser Phe Pro Pro Ser Gln Val Cys Asp Leu Leu Ile Glu His Asp Pro

305 310 315 320

Glu Met Pro Leu Thr Lys Val Ile Asn Arg Met Phe Gly Glu Glu Arg

325 330 335

Pro Lys Leu Ser Asp Ser Glu Lys Arg Glu Arg Ala Gln Leu Lys Gln

340 345 350

His Asn Pro Val Pro Asn Met Thr Lys Leu Leu Val Asp Ile Gly Gln

355 360 365

Asp Leu Val Gln Glu Ala Thr Tyr Cys Asp Ile Val His Ala Lys Asn

370 375 380

Leu Pro Glu Val Pro Lys Asn Leu Glu Thr Tyr Lys Gln Val Ala Ala

385 390 395 400

Gln Leu Lys Pro Val Trp Glu Thr Leu Lys Arg Lys Asn Glu Pro Tyr

405 410 415

Lys Leu Lys Met His Arg Cys Asp Val Cys Gly Phe Gln Thr Glu Ser

420 425 430

Lys Leu Val Met Ser Thr His Lys Glu Asn Leu His Phe Thr Gly Ser

435 440 445

Lys Phe Gln Cys Thr Met Cys Lys Glu Thr Asp Thr Ser Glu Gln Arg

450 455 460

Met Lys Asp His Tyr Phe Glu Thr His Leu Val Ile Ala Lys Ser Glu

465 470 475 480

Glu Lys Glu Ser Lys Tyr Pro Cys Ala Ile Cys Glu Glu Asp Phe Asn

485 490 495

Phe Lys Gly Val Arg Glu Gln His Tyr Lys Gln Cys Lys Lys Asp Tyr

500 505 510

Ile Arg Ile Arg Asn Ile Met Met Pro Lys Gln Asp Asp His Leu Tyr

515 520 525

Ile Asn Arg Trp Leu Trp Glu Arg Pro Gln Leu Asp Pro Ser Ile Leu

530 535 540

Gln Gln Gln Gln Gln Ala Ala Leu Gln Gln Ala Gln Gln Lys Lys Gln

545 550 555 560

Gln Gln Leu Leu His Gln Gln Gln Ala Ala Gln Ala Ala Ala Ala Ala

565 570 575

Gln Leu Leu Arg Lys Gln Gln Leu Gln Gln Gln Gln Gln Gln Gln Gln

580 585 590

Ala Arg Leu Arg Glu Gln Gln Gln Ala Ala Gln Phe Arg Gln Val Ala

595 600 605

Gln Leu Leu Gln Gln Gln Ser Ala Gln Ala Gln Arg Ala Gln Gln Asn

610 615 620

Gln Gly Asn Val Asn His Asn Thr Leu Ile Ala Ala Met Gln Ala Ser

625 630 635 640

Leu Arg Arg Gly Gly Gln Gln Gly Asn Ser Leu Ala Val Ser Gln Leu

645 650 655

Leu Gln Lys Gln Met Ala Ala Leu Lys Ser Gln Gln Gly Ala Gln Gln

660 665 670

Leu Gln Ala Ala Val Asn Ser Met Arg Ser Gln Asn Ser Gln Lys Thr

675 680 685

Pro Thr His Arg Thr Pro Thr Phe Val Cys Glu Ile Cys Asp Ala Ser

690 695 700

Val Gln Glu Lys Glu Lys Tyr Leu Gln His Leu Gln Thr Thr His Lys
705 710 715 720

Gln Met Val Gly Lys Val Leu Gln Asp Met Ser Gln Gly Ala Pro Leu

725 730 735

Ala Cys Ser Arg Cys Arg Asp Arg Phe Trp Thr Tyr Glu Gly Leu Glu

740 745 750

Arg His Leu Val Met Ser His Gly Leu Val Thr Ala Asp Leu Leu Leu

755 760 765

Lys Ala Gln Lys Lys Glu Asp Gly Gly Arg Cys Lys Thr Cys Gly Lys

770 775 780

Asn [illegible] Ala Phe Asn Met Leu Gln His Leu Val Ala Asp His Gln Val

785 790 795 800

Lys [illegible] Cys Ser Ala Glu Ile Met Tyr Ser Cys Asp Val Cys Ala Phe

805 810 815

Lys Cys Ser Ser Tyr Gln Thr Leu Glu Ala His Leu Thr Ser Asn His

820 825 830

Pro Lys Gly Asp Lys Lys Thr Ser Thr Pro Ala Lys Lys Asp Asp Cys

835 840 845

Ile Thr Leu Asp Asp

850

<210> 2 
<211> 4001 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 2 

tcacacactc atgacataca cacatcattt cgcctcacac accgcgccgt cgccatccgc 60 
accgcccggg tgggacgtgt tcaaactttt cggttttcgt aattaatagt gagccccggt 120 
ttattcgctt tgagaatcag tataatggat atatcagatt gtgtaattag gttgcgtgct 180 
tgaactttta aaattaactg ttttaaattt atctgccttt atcgttacag taaatcattt 240 
tgatgaactt ttcggatgaa tcataatgaa gtacgcagcg ctctaacaaa atgtgtttgt 300 
aaattccaat tgctacaagt tgcccggctt attttttggt gattgaagca tgattctgtt 360 
gacgctcccg acgcggaata ccaggacgga ccgatgagag agtactgcca gtgaagagac 420 
gcatgcgagc aggacgagtg ctgctcaccc ttcttctcag cgtcggcggc tgcgaccagc 480 
ggccgaggaa ggggaggaga gaggccgatt tggctgcgta ccacgtttga tactcagtca 540 
cttaccacag ctggttctct tgtgcgttca aatctggctt gccgcgcgcg cgcattttat 600 
tcctaccagt ttgaatctcc cacctctccg actgtaactg tcctaatttg cttccttctc 660 
atcactctct ctttgcctat ttctcactat ctagactcta tttttccaga atggtcaccg 720 
ccgacgagac ggtactcgcc acaacgacca acaccacttc catgtctgtg gaaccaacgg 780 
atccgagaag cgctggtgaa tcgtcctcag attcggagcc agacacaatt gaggtgagga 840
aaagttttgg gaatttaaat ctgaataaaa cgttttcagc agctgaaggc agaacagcgc 900 
gaagtgatgg ccgacgcggc gaatggttcc gaagtcaacg gaaatcaaga gaacggaaaa 960 
gaggaagcgg catctgcaga cgtggaagtg atcgagatag atgacaccga agagtctacg 1020 
gatccctcac ctgatggatc tgatgaaaac ggtgatgctg catctacatc ggttccaatc 1080 
gaagaggaag cgcgtaaaaa ggatgagggg gcttccgaag tgactgtggc atcatctgag 1140 
attgaacaag acgatgatgg cgatgttatg gaaatcactg aggagccgaa cggaaagtcg 1200 
gaggatactg ccaacggaac aggtgtgttt tataatttta ccaagtttaa ttttaacttt 1260 
ctattttcag ttactgagga ggtgctagat gaagaggagc cagaaccttc cgtaaacgga 1320 
acaactgaga tcgctacaga gaaagagcca gaagattctt caatgcctgt cgaacagaat 1380
gggaagggtg tgaagcggcc tgtcgaatgc atcgaactcg acgacgacga tgatgacgag 1440 
attcaggaaa tttctacccc tgccccagct aaaaaagcta aaattgatga tgtcaaggcg 1500 
acaagcgttc cagaagagga caacaatgag caggcgcaga agagattgct cgacaagctg 1560
gaagagtatg tgaaggagca gaaggatcaa ccatccagca aaagccgaaa agttctggac 1620 
actcttctcg gagcaatcaa tgcgcaagtt caaaaggagc ctctgtcggt tcggaagctg 1680 
atcctggaca aagttctcgt tctcccaaac acaatatcat tcccaccaag tcaagtttgc 1740 
gacttattga ttgagcacga tcccgaaatg cctttgacga aggttatcaa caggatgttt 1800 
ggagaagaaa gaccaaagtt gagtgattcc gagaaacgag agagagctca gctgaaacaa 1860 
cataatcctg ttccaaatat gacaaaactg ctcgtggaca ttggacagga tctcgttcaa 1920
gaagctacct attgtgatat agttcacgcg aagaatcttc cagaggtgcc aaaaaatctt 1980
gaaacctata agcaagtcgc tgcgcagttg aaaccagttt gggagacatt gaaacgcaaa 2040
aatgagccgt acaagttgaa aatgcatcga tgcgacgtct gtggattcca gacggaatca 2100
aagctggtta tgagcactca caaggagaat ttgcacttca caggatccaa attccagtgc 2160
accatgtgta aagagacgga cacgagtgag caaagaatga aggatcacta cttgtaagtt 2220
tttttttttt catctttcaa tattcattta attacagcga aactcatctt gttattgcaa 2280
aatcggaaga gaaggagtcc aagtatccat gtgcaatctg cgaagaagac ttcaatttca 2340
aaggtgtccg tgagcagcat tacaagcagt gcaagaagga ctacattcgc attcgaaaca 2400
tcatgatgcc gaagcaagac gatcatctct atatcaacag atggctctgg gagaggcccc 2460
aattggatcc cagcattctt caacagcagc aacaagctgc tcttcagcaa gctcaacaaa 2520
agaagcaaca gcaacttctg catcaacagc aagcagcaca agctgcagcc gctgcgcaac 2580
tcttacggaa gcaacaatta caacagcaac aacaacagca acaggctcgt cttcgtgagc 2640
aacagcaagc ggcccaattc cggcaagtgg ctcaactgct gcaacaacaa tcagcgcagg 2700
ctcaacgtgc acagcagaat caaggaaatg tgaatcataa cactctgatt gcaggtaata 2760
gctaaacata ttttaaataa gtattttgta taattattta tatttcagca atgcaagcgt 2820
cgttgcgtag aggtggtcaa caaggaaatt cgctggcagt ttctcaactt ctccaaaagc 2880
aaatggcagc tttgaagtcg caacaaggag ctcaacaact tcaggctgcg gtgaactcca 2940
tgagaagcca gaacagtcaa aagacgccaa cacacagaag ttcgaaactt gttactacgc 3000
cgtctcatgc tactgttggc tcttcttcag ctcccacgtt tgtatgcgaa atttgtgatg 3060
cgtcagtgca ggaaaaggag aagtatctac agcatcttca ggtaatttta agaaacgttt 3120
ctatttcaat ttcaaaaccg attattaaat atcttaaaca tcacattttc agactactca 3180
taagcagatg gttggaaaag tgctgcagga catgtcgcaa ggagctccac tggcatgttc 3240
tcgatgccgt gacagattct ggacttatga agggttggag cggcacttgg tgatgtcgca 3300
tggtctcgtc actgctgatc tgctcctcaa agcgcaaaag aaggaagacg gaggtcgatg 3360
caagacatgc ggcaagaact atgcgttcaa catgcttcaa cacttggtag ctgatcatca 3420
agtgaagttg tgctcggctg aaatcatgta ctcgtgcgat gtgtgcgcgt tcaaatgctc 3480
gagttatcag actctggaag cccatctcac ttcaaatcac ccaaaaggag ataagaagac 3540
atcaacacca gcaaaaaaag atgattgtat tactctggat gattaatagg aaaacgaatg 3600
gcttatcccg ttctacgaat gagtgctgga aacattcttc acaatgatct caattatttc 3660
tcttattctt tacattcaat cattttaaat caccagttct cccactttca ttgatataca 3720
cattctattg cgggttccgg aaccgaaatc aatcagtact ttactttatt tccccaattt 3780
ttctcttcat gatatctggt ttattctcgc atcttcccct accttcaaaa ctccctattt 3840
ttttttcaaa acctaactac cccacaatta tcatgtaaaa tcaaattgca attccccata 3900
agacagatca gtatacactt tcacttcata cgtctgttgt tctcccccat ctcatacttt 3960
ttttaccatt tgtccagtta agatttttgg aagatatcta t 4001

<210> 3
<211> 2562
<212> DNA

<213> Caenorhabditis elegans
<400> 3

atggtcaccg ccgacgagac ggtactcgcc acaacgacca acaccacttc catgtctgtg 60
gaaccaacgg atccgagaag cgctggtgaa tcgtcctcag attcggagcc agacacaatt 120
gagcagctga aggcagaaca gcgcgaagtg atggccgacg cggcgaatgg ttccgaagtc 180
aacggaaatc aagagaacgg aaaagaggaa gcggcatctg cagacgtgga agtgatcgag 240
atagatgaca ccgaagagtc tacggatccc tcacctgatg gatctgatga aaacggtgat 300
gctgcatcta catcggttcc aatcgaagag gaagcgcgta aaaaggatga gggggcttcc 360
gaagtgactg tggcatcatc tgagattgaa caagacgatg atggcgatgt tatggaaatc 420
actgaggagc cgaacggaaa gtcggaggat actgccaacg gaacagttac tgaggaggtg 480
ctagatgaag aggagccaga accttccgta aacggaacaa ctgagatcgc tacagagaaa 540
gagccagaag attcttcaat gcctgtcgaa cagaatggga agggtgtgaa gcggcctgtc 600
gaatgcatcg aactcgacga cgacgatgat gacgagattc aggaaatttc tacccctgcc 660
ccagctaaaa aagctaaaat tgatgatgtc aaggcgacaa gcgttccaga agaggacaac 720
aatgagcagg cgcagaagag attgctcgac aagctggaag agtatgtgaa ggagcagaag 780
gatcaaccat ccagcaaaag ccgaaaagtt ctggacactc ttctcggagc aatcaatgcg 840
caagttcaaa aggagcctct gtcggttcgg aagctgatcc tggacaaagt tctcgttctc 900
ccaaacacaa tatcattccc accaagtcaa gtttgcgact tattgattga gcacgatccc 960

gaaatgcctt tgacgaaggt tatcaacagg atgtttggag aagaaagacc aaagttgagt 1020 

gattccgaga aacgagagag agctcagctg aaacaacata atcctgttcc aaatatgaca 1080 

aaactgctcg tggacattgg acaggatctc gttcaagaag ctacctattg tgatatagtt 1140 

cacgcgaaga atcttccaga ggtgccaaaa aatcttgaaa cctataagca agtcgctgcg 1200 

cagttgaaac cagtttggga gacattgaaa cgcaaaaatg agccgtacaa gttgaaaatg 1260 

catcgatgcg acgtctgtgg attccagacg gaatcaaagc tggttatgag cactcacaag 1320 

gagaatttgc acttcacagg atccaaattc cagtgcacca tgtgtaaaga gacggacacg 1380 

agtgagcaaa gaatgaagga tcactacttc gaaactcatc ttgttattgc aaaatcggaa 1440 

gagaaggagt ccaagtatcc atgtgcaatc tgcgaagaag acttcaattt caaaggtgtc 1500 

cgtgagcagc attacaagca gtgcaagaag gactacattc gcattcgaaa catcatgatg 1560 

ccgaagcaag acgatcatct ctatatcaac agatggctct gggagaggcc ccaattggat 1620 

cccagcattc ttcaacagca gcaacaagct gctcttcagc aagctcaaca aaagaagcaa 1680 

cagcaacttc tgcatcaaca gcaagcagca caagctgcag ccgctgcgca actcttacgg 1740 

aagcaacaat tacaacagca acaacaacag caacaggctc gtcttcgtga gcaacagcaa 1800 

gcggcccaat tccggcaagt ggctcaactg ctgcaacaac aatcagcgca ggctcaacgt 1860 

gcacagcaga atcaaggaaa tgtgaatcat aacactctga ttgcagcaat gcaagcgtcg 1920 

ttgcgtagag gtggtcaaca aggaaattcg ctggcagttt ctcaacttct ccaaaagcaa 1980 

atggcagctt tgaagtcgca acaaggagct caacaacttc aggctgcggt gaactccatg 2040 

agaagccaga acagtcaaaa gacgccaaca cacagaactc ccacgtttgt atgcgaaatt 2100 

tgtgatgcgt cagtgcagga aaaggagaag tatctacagc atcttcagac tactcataag 2160 

cagatggttg gaaaagtgct gcaggacatg tcgcaaggag ctccactggc atgttctcga 2220 

tgccgtgaca gattctggac ttatgaaggg ttggagcggc acttggtgat gtcgcatggt 2280 

ctcgtcactg ctgatctgct cctcaaagcg caaaagaagg aagacggagg tcgatgcaag 2340 

acatgcggca agaactatgc gttcaacatg cttcaacact tggtagctga tcatcaagtg 2400 

aagttgtgct cggctgaaat catgtactcg tgcgatgtgt gcgcgttcaa atgctcgagt 2460 

tatcagactc tggaagccca tctcacttca aatcacccaa aaggagataa gaagacatca 2520 

acaccagcaa aaaaagatga ttgtattact ctggatgatt aa 2562 

<210> 4 
<211> 10 
<212> DNA 

<213> Caenorhabditis elegans 



<400> 4

agtttcagac 10

<210> 5
<211> 10
<212> DNA

<213> Caenorhabditis elegans
<400> 5

agtttcagac 10



<210> 6 
<211> 13 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 6 

agaatcttca gtc 13 

<210> 7 
<211> 13 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 7

agaatcttca gcc 13 

<210> 8 
<211> 13 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 8 

agaactttaa gat 13 

<210> 9 
<211> 13 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 9 

agaactttaa gat 13 

<210> 10 
<211> 10 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 10 

agttgcagaa 10 

<210> 11 
<211> 10 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 11 

agttgcagaa 10 

<210> 12 
<211> 16061 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 12 

gaggaagatg tagacgacga ttcggtttcc gtactctcat gacttttggc gaaaatcctc 60 
acgaattctt tttccgtcat acgttgagtt aaaaatctgg cgatgtaacg aagaatgaga 120 
agagcgtttg atgtttgcca taagtagatt ttactgaaat aagaaaaagc tttaattaaa 180 
tataatgatg attttttttt ccaactcact tttcgcattg ttctgatgtt tttagttctg 240 
tggctctgcg aaggaaaagt cgaataaatg cagcgaaatt tcctgttgtt tgtgtattgt 300 
acattagaca ttgaagatga tcatctaaag cagattccaa agcgattcgg gtgtctctaa 360 
acgattataa catttttaaa gcttttgcct aattttaatc cttactcgtc gtcatcatca 420 
aacttgagac tgaaagagag aagtttgttc caaaatgggt cataatcgtc gacaggttcc 480 
aaaccgctga gtttcttcag ataaatattc tcctgtaaga ccgtttcctt ggttataact 540 
gatcccatgt gtctgaaatt tgttattaca ctgttaataa tcataaaaat aaaagaaaaa 600 
gtcaagaaag ggtcaaatat taatcaggtc acatcttttt tattcaataa aatctcctct 660 
ctcgttcgtg gcaatgcacg tgaaatgcgc caacaaccgc gagtgcgcca acacacacac 720
atacgcgtca gcagacaatt cgctctcgtt tgaaatttag ttgtttcttt gtttctgctg 780 
aaataatgtc agttttccga taatttcagc gttttctgac tgatttttct tgttgcattc 840 
acttcctaat agttcattct actccattct tcattttata atctgtttcc ttcgcaattt 900 
agtgaattaa acacgtaaat cttgtttcag ataaattatt caaatagttg cacaaagctc 960 
aatagtttag aagtatcttc agtgctggtc actaatacaa aatggatccg gctatggctt 1020 
ctccaggcta tcggtctgtg cagtccgatc ggagtaatca cctaacagag ctggaaacga 1080
gaattcaaaa tcttgccgat aattcacaaa gagatgatgt caaattgaaa atgttacaag 1140
ttagtttcaa taattcgtgt taagtaatca atttgttcgg ttgcaggaga tttggagcac 1200
aatcgaaaat catttcacac taagttcgca cgagaaagtc gtggagaggc tcattctctc 1260
gttcctacaa gttttctgca acacaagtcc acagttcatt gctgaaaaca atacacaaca 1320
gcttcgaaag ttaatgcttg aaatcattct tcgactttcg aacgtagaag ccatgaaaca 1380
tcatagcaaa gaaattatca agcagatgat gaggctaatc accgtggaaa atgaggagaa 1440
tgccaatttg gctatcaaaa ttgtcaccga tcaagggaga agtaccggca aaatgcaata 1500
ttgcggagag gtttcacaga taatggtctc cttcaaaaca atggtcattg atctgacggc 1560
gagtggtcga gctggtgata tgttcaacat aaaagagcat aaagctccac cgtcaactag 1620
ctccgacgag caagtcatca ctgaatattt gaagacttgc tactatcaac aaacggttct 1680
tctcaacgga acggaaggaa aaccgccatt aaaatacaat atgattccat cagctcatca 1740
gtcaacgaag gtgctcctgg aggttccgta tctcgtgatt ttcttctatc aacatttcaa 1800
aacagcgatc caaaccgaag cgcttgattt catgaggctt ggtcttgatt ttctaaatgt 1860
cagagttcca gacgaggata aactcaaaac aaatcaaata ataaccgatg attttgtcag 1920
tgcacagtcc cgattcctgt cattcgtcaa cattatggct aagattccag cggtaagttt 1980
cgttttttca agtttttttt ctgtaatcct gatttttatt tttcagttta tggatcttat 2040
catgcaaaat ggaccgcttc tagtgtcggg aacaatgcag atgctcgagc ggtgcccggc 2100
tgatctgata agtgtccgac gagaagttct gatggctttg aagtatttca catctggaga 2160
aatgaagtcg aaattctttc caatgctacc tcgactcatc gctgaggagg ttgttctggg 2220
aacaggattc actgcgattg agcatttgcg agttttcatg tatcaaatgc tagcagatct 2280
gttgcatcac atgcgaaatt ctatagacta tgaaatgatc acacagtaag tttgaataag 2340
actttctgat gaaaaatgtt gaaatttcag cgtgattttc gtattctgtc gcactcttca 2400
cgatcctaac aactcttctc aagtccagat tatgtctgct cggctgctca actcactggc 2460
cgaatctctg tgcaaaatgg attcacatga taccgtaaga cttattctat caataatcgt 2520
atctcacttc gaaataagtt tcagactcgt gatctgctca ttgaaatcct ggagtcgcac 2580
gtggccaagc tcaaaactct tgcagtctat cacatgccta ttctcttcca acaatacgga 2640
accgaaatag actacgaata caaaagttat gagagagacg ccgagaaacc tggaatgaat 2700
atcccaaagg acactatacg aggagtaccg aaacgaagaa tccgtcggct ctccattgat 2760
tcagttgaag agctggaatt cctggcatca gaaccatcca cgtcggaaga tgcagatgag 2820
agtggtggag atccgaacaa gcttcctccg ccaacaaaag agggaaagaa aacgtctccc 2880
gaagcgattt taaccgccat gtcaacgatg acacctcctc cattggcaat tgttgaagct 2940
cgaaatcttg tgaagtatat aatgcatacg tgtaaattcg tgacaggaca attgagaatc 3000
gcccggccat cacaggatat gtatcattgt tcgaaggagc gagatttatt cgaacgtctt 3060
ctacgatatg gtgtaatgtg tatggatgta ttcgtgcttc caacaactcg aaatcaacca 3120
caaatgcatt cttcaatgcg gacaaaagat gagaaagatg ctctggagtc gttggcaaac 3180
gtttttacaa caatcgacca tgcgatattc cgggaaatct tcgaaaagta tatggatttc 3240
ttgattgaaa gaatttacaa tcggaactat ccattgcaat tgatggtgaa caccttcttg 3300
gttcgaaatg aagtgccatt cttcgcatct acgatgcttt cattcttgat gtctcgaatg 3360
aaattgctgg aagttagcaa tgacaagacg atgctatatg tgaagctctt caaaattatc 3420
ttctccgcca tcggagccaa tggctctggg cttcatggag ataaaatgct cacttcatac 3480
ctcccagaga ttctcaaaca gtcaactgtc ttggcattaa cagctcgtga acctctcaac 3540
tatttccttt tgcttcgtgc attgttccgc agtattggtg gtggcgctca ggatattttg 3600
tatggaaagt tcctgcagtt actgccaaat cttcttcaat tcttgaataa attgacggtg 3660
agtttcattt tttgatatat cggtaataca ctaaaaatcc agaatcttca gtcatgtcaa 3720
catcggattc aaatgcgtga gctcttcgtc gagttgtgtt tgactgtgcc agttcgactc 3780
agttcccttc tgccatacct accgcttctg atggatccac tggtgtgtgc gatgaatggg 3840
agtccgaaca tagttacaca aggattgaga acattggaat tatgtgtgga taacttgcaa 3900
cctgaatatc ttctcgaaaa tatgcttcct gtccgtggag ctttgatgca aggcctctgg 3960
cgtgttgtat cgaaagctcc agatacatca tcgatgacag cagcgttcag gatcctcgga 4020
aagttcggag gagccaatcg aaaacttctg aatcaaccgc aaattcttca agtagccact 4080
ttaggcgacg taagtttatt tagtttattc tcttcctcgt tttaagttct aacattgatc 4140
ctattaacag actgttcagt cgtacatcaa tatggaattc tcgcggatgg gactcgatgg 4200
caatcacagc attcacctgc cactgtccga gttgatgaga gtcgttgccg atcagatgag 4260
atatccagct gatatgatcc ttaatccaag tcctgcaatg atcccgtcaa ctcatatgaa 4320
gaaatggtgt atggaattgt cgaaagccgt cttgttagcc ggacttggat cttcaggaag 4380
cccaattact ccaagtgcaa atcttccgaa gattatcaag aaacttcttg aagattttga 4440
tccaaacaat cgtaccactg aagtatacac atgtccgagg gaaagtgatc gagagctttt 4500
tgtgaatgca cttctcgcaa tggcttgtaa gttcttaagt tcttttctct ctaatcagat 4560
ctatatttta aatttttcag acggaatatg gaataaagac ggtttccggc atgtctatag 4620
caaattcttt atcaaagttc tccgccagtt tgcgttgatt ggagtactcg aatacattgg 4680
tggaaatgga tggatgcgtc atgcagaaga ggaaggtgtt ctaccattgt gccttgactc 4740
gtctgttatg gttgatgctc tgattatttg tctctctgaa acatcgtcaa gcttcatcat 4800
tgctggtgtc atgtctcttc gtcatatcaa tgagactctc tcgcttacac ttcccgatat 4860
tgatcaaatg tcgaaagttc caatgtgcaa atacttgatg gagaaggtgt tcaaattgtg 4920
tcacgggcct gcttggtatg caagatctgg tggaatcaat gcaattggat acatgatcga 4980
atcgtttcca cgaaaatttg ttatggactt tgtgatagat gttgttgatt cgatcatgga 5040
agttattttg ggaactgttg aagaaatatc aagtggatct gctgattctg catacgattg 5100
tctcaagaaa atgatgcgag tctatttcat caaagaagaa ggccaagaag aggagaatct 5160
gacactcgcg actatttttg tgtctgcaat ctctaagcat tacttccaca gtaatgaaag 5220
agtcagagaa tttgcgattg gtttaatgga tcattgtatg gttcactcaa gacttgcacc 5280
atcccttgat aagttctact atcgattcaa ggagttcttt gagccagaat taatgcgggt 5340
gctcacaaca gttccaacaa tgtcattggc agacgcagga ggaagtttgg atggagttca 5400
aaactatatg ttcaactgtc cggatggttt tgatttcgaa aaagatatgg acatgtacaa 5460
gcgatatttg tcacatctgc tggatattgc acaaaccgat acatttacct taaaccaaag 5520
gaatgccttc aaaaaatgcg agacatgccc atcgcatttc cttcctccat tcccaatcac 5580
tacacatatt gattcaatgc gagccagtgc tctacagtgt cttgtgatcg cgtatgatcg 5640
aatgaagaag caatacatcg acaagggaat agagctgggt gatgagcata agatgataga 5700
gatcctcgca cttcgcagct ccaagatcac agttgatcaa gtctacgaga gcgatgaatc 5760
ttggagacga ttgatgacag ttctattgag agcagtcact gacagagaaa ctcctgaaat 5820
tgcggagaag cttcatcctt cacttttgaa ggtctcacca atatccacaa tcatcatcgc 5880
aacatttggt gcttcttaca taagaaatat tagtggagca ggagatgaca gtgattcaga 5940
tcgtcatatt tcgtacaacg atataatgaa gttcaagtgt ctcgtggagc tcaatccaaa 6000
gattctggtc acaaaaatgg cagtgaatct cgcaaatcaa atggttaaat ataagatgag 6060
tgacaagatc tctaggattt tgtcagttcc cagtagcttc actgaagagg agctcgatga 6120
tttcgaagcg gagaagatga aaggaattcg agagttggat atgattggtc atacggttaa 6180
aatgcttgct ggatgcccag tgaccacatt cacggagcaa attattgtgg atatcagtcg 6240
ttttgctgct cattttgagt atgcttattc gcaagatgta cttgtaaatt ggattgatga 6300
tgtcacagta atcctcaaca aaagtcccaa agatgtatgg aagttcttct tgtctcgaga 6360
atcaattcta gatcctgcac gcagatcctt tattcgaaga atcatagtct atcaatcaag 6420
tggtccactg cgacaggaat tcatggatac tccggaatat tttgagaaac tcattgatct 6480
tgacgatgag gagaataagg atgaagatga gagaaaaatc tgggatcgtg atatgtttgc 6540
attttcgatt gtcgatcgta tctcgaagag ctgccctgag tggcttattt ctccgaattc 6600
cccaattcca agaattaaga agttgttctc cgaaacggaa ttcaatgagc gatatgtggt 6660
tcgagcattg actgaggtga agaaatttca agaagagatc atagtgaaac ggatgacaga 6720
gcacaagtac aaggttccga agctgattct gaataccttc ctgagatatt tgaggtaatt 6780
tcaagatagt ttgtaaaaat taattacaaa gaaatatacc aaaactgaac cccaaaaaaa 6840
aatttttgaa tttcggatca aaaaaattta atattttctc gaaaaatcct tcaaaatacc 6900
aaaaaattcg aattctcact tctaaaatta tttttgaatt tttaaataat ttttgaacat 6960
ttctctatga aattcatgtt ttgggcctat ttcaggctat aaaaattatt tttctgattt 7020
taaataactt gcaaatttca ggctcaacat ctatgactac gatctattca tcgttatcgc 7080
ctcgtgtttc aatggcaatt tcgtcaccga tctctctttt cttcgcgaat atcttgaaac 7140
tgaagtcatc ccgaaagtgc cgttacaatg gcggagagag ctgtttcttc gaattatgca 7200
gaagtttgat acggatccac aaactgctgg aacaagtatg cagcatgtga aggcccttca 7260
atatttggtt attcccacgt tgcattgggc gttcgagcga tatgatacgg atgaaattgt 7320
tggcaccgca ccaatagatg attcggattc ttcgatggat gtagatccgg caggcagctc 7380
ggataacctt gtggctcgtt taacatcagt cattgattct catcgtaatt atctgagcga 7440
tggaatggtc attgttttct atcaactttg cacattgttc gtacaaaacg cctccgaaca 7500
tattcacaat aataactgca agaaacaagg tggacgccta cggatcctga tgctcttcgc 7560
ctggccgtgc ctgaccatgt acaatcatca agatccaaca atgcggtaca ctggattctt 7620
cttcttggcc aatattatag agcgtttcac aattaatcgg aaaatcgtgc ttcaagtgtt 7680
ccatcaactt atgactactt atcagcagga cactagagat caaatccgga aagccattga 7740
tatattaact ccagctttga ggacacgaat ggaagatgga cacttgcaaa tattgagtca 7800
tgtgaagaaa attcttatcg aagaatgcca taatttgcaa catgttcagc atgttttgta 7860
agtttattat ctaaaatgat tttttttaat gttaaaaatt taattttaaa atgcgttcgt 7920
gctcctttaa taattcctga attttccagc caaatggtgg ttcgcaatta tcgtgtctac 7980
tatcatgttc gattggagct tctcacgcct cttctgaacg gagttcaacg agcacttgtg 8040
atgccaaata gtgttctgga aaagtaagtt tccagcccgt tgttcgtaaa ctcacccctt 8100
gtaaatattt agctggcaaa ctcgacgtca tgcggtggag atctgcgaga tggtcatcaa 8160
gtgggaattg ttcagaacgc tgaaaacaga tcatattatc agtgacgaag aagctctcga 8220
agttgacaag caattggata agctgcgaac agcttcatcc acagatcgtt tcgatttcga 8280
ggaggctcat aacaagagag acatgcctga tgctcaacgc acgattatca aagagcacgc 8340
cgatgtgatt gtcaatatgc ttgtccgatt ctgtatgacg ttccatcaga attcgggttc 8400
ttcgtccact tctcaaagtg ggaaccatgg tgtcgagttg accaaaaaat gtcagctgct 8460
tctacgtgca gccctacgac caagcatgtg gggagaattt gtcagcttcc gattaacaat 8520
gatcgaaaag tttttgtcaa ttccgaatga taatgctcta cgcaatgata taagttctac 8580
ggcctacgct aatactatcc aaaatgcaca acacactctg gatatgctgt gtaatattat 8640
tcctgttatg ccaaaaacta gcttgatgac tatgatgaga caactccaac ggccactcat 8700
acaatgtctc aataacggag ctcaggtatg tgaagaacga tgaatagggg gttataaatc 8760
actaatttct cttagaactt taagatgact cgtcttgtca ctcaaattgt cagtcggtta 8820
ctcgaaaaga caaatgtttc ggttaacggg cttgatgagc tggagcaatt gaatcaatac 8880
atttcccgat tcctacatga acattttgga tctcttttga agtaagtttt atttttgaat 8940
ttccatcttt caacccttcg ccagttgcag aaacttgagt ggaccagtgt tgggagttct 9000
cggagcattt tctcttttgc gaacaatttg tggacacgag ccagcatact tggatcattt 9060
gatgccttca tttgtaaaag tgatggagag agctgcaaaa gagcacttgg cgtatgttgc 9120
gaactcgcaa gatggaaata tggtgaagag taagttctat aaaaagattc agattttcta 9180
atccccttag atttctttcc agatgttgct gaattgttgt gtgcatgcat ggagctggta 9240
cgtcccagag tcgatcatat cagtatggag attaagagat caattgttgg tggtattatc 9300
gcggagctga ttatcaaatc gaatcacgat aagatcatcc agacgtcagt gaagcttctc 9360
ggagcaatga ttagcacgca ggatatggaa tttacaattc tcactgttct tccgctactt 9420
gttcgtatcc aatcaattat tgtgaccaag ttcaagaatt gcaaggatct gatagcagac 9480
tatcttgttg tggttattac cgtttttgag aacagcgaat atcggaactc ggaagctgga 9540
tctcgtctct gggaaggatt cttctgggga ctcaagagta gcgatcctca aacccgggag 9600
aaattctcga tagtttggga gaagacttgg ccacacatgg caacagtaga tattgctcat 9660
cgaatgaaat atatcatgca aaatcaagat tggtccaagt tcaaacacgc gttttggttg 9720
aaattcgcac tttggggaat gctacgaacg attgccaaac ggccaactga tccgaataat 9780
aagagaaaga aagtgatact gttgaactgt gcaactccat ggagaacaat tgaatatgca 9840
gcgaaattga aggatcagcc aatggaagtg gaaactgaaa tgaaacgaga agagccagaa 9900
ccgatggaag ttgacgaaaa agactcgcaa gatgattcta aggatgccgg agagcccaag 9960
gagaaggaaa agctcacatt ggaattattg cttgctggac aacaagaact tttggatgaa 10020
gcttccaatt atgattttgc ggatgctcta gatacagtat cccagattac atttgcactt 10080
aatggtaaat tgttcaaagt ttatgaatat ttttcttaaa aatcacaatt ttcagagaat 10140
caagtgacaa gcaagatgtg ggtagtgttg ttcaaatcat tctggagttc cttatcacaa 10200
tccgaaatcg aagatttcac ggcgctagtc gttccgttta tgagcagtgg agtgcataat 10260
aattatcaga cgggtgtaca ggatagtgtg cttgctgttt ggcttgaagc tgttggtgac 10320
gctgttcatt tgccgtccag attgattgag gtacgttctg aaaatgaatg ctggaaaaaa 10380
ttcgattttt ctgtttaaaa aaagttaaaa tttccgattt tttgaatagc aaaaaaaaaa 10440
gaaaacattt attttgaaaa aagagtcctc accggaattt tttaataaat aaatttaaaa 10500
aaagaaaaaa aactaaaaac ttcaattttt gaaaatcaaa aaaaaaatta cagaaacaga 10560
cgaggtaaaa aattttaaaa aagttctgta aaaaaaatgg agaatcacag ttttcgttgt 10620
cttttctgaa aaaaatttga aaaattaaaa attaacgatt ttttggtttt taatttaaaa 10680
aaatatacga aaaaagactg aagaactttt tttgtcaaaa aaacttgatt ttgatgaggg 10740
aaaaagttca aaaacttgga gaaatcatcg gaaattttag aagattcaat aaaaatttcc 10800
aaaaaaaaaa attgaacatt tatgattttt gggtattttg aaaaattgaa aaattacgct 10860
taatttttag attaaaaaaa tcaaaaaaaa accaacactc cttttgaaac ttgacacttt 10920
tgaaacgttt tttttttttg caataataaa tttctcattt cagtttatct catcaaaaca 10980
cgaatgctgg cataccggaa tcaggcttct cgagaatcat atatggacaa ttccaaagca 11040
actcaacaac acgttactcc gagaaatgaa agtggcacca ggtctcgctg gagatattga 11100
gacactcgaa tctcttggaa cactctacaa tgagatatca gagtttgatc agttcgctgc 11160
aatctgggaa cgccgtgctg tatttcctga tacgatgaga gcaatgtcag ctatgcaatt 11220
gggagatatg gaattagctc aatcttatct ggaaaaatca atgagcagta cgtatgaaac 11280
tcttgctccg acaatcaatc gtaagtttgg atcaatcggt tgtacttctc acacaaaata 11340
gtattccttt cagcaaacaa cacttcaaat tcggagaagc atgtttctcc gattattgac 11400
aaagaatacg atcattggat ggagatgtac atcacaaatt gctcggagct tcttcagtgg 11460
caaaatgtgg ccgacgtatg caatggcaaa gacatgcaac atgttcgtgg cctgatcaac 11520
gcagcatctc acattccgga ctggaatgtg gtcgaggagt gtaaaagtca gatagctgga 11580
tgtattccac caagtttcca tttagattac actcttttca atttgatgag tactgttatg 11640
gttagtttaa gtcaaaaagt gatatataat tattgtttaa tttttcagcg aatgaatgaa 11700
aactcaagcc cgacacatat gaaggaacga tgcaaaattg caattcaaga gtgcacagaa 11760
gctcatatta gtcgttggag agcacttccg tcagttgttt catatggtca tgtcaagatt 11820
cttcaggcaa tgaacttggt tcgagaaatt gaagagtcta cagatattcg cattgctctg 11880
ctcgaggccc catcaaacaa agtggatcag gcgttgatgg gcgatatgaa gtcgttgatg 11940
aaagtattcc gaaatagaac accaaccact tcggatgata tgggattcgt ttcgacttgg 12000
tatgattgga ggaatcagat tcatggaatg atgcttcaaa gattcgaata ttgggataaa 12060
gtaggactca acgtcgctgc aactggaaac cagtcaattg ttccgattca ttcaatggct 12120
caagcacagt tggccgtagc caaacatgcc aagaatcttg gattccataa tttaacgaaa 12180
gatctactca acaaattagc tggattgaca gccataccga tgatggatgc tcaagataaa 12240
gtttgcactt acggcaagac acttcgcgat atggcaaaca gtgcggctga cgaaagagtg 12300
aaaaatgagc tattgtgtga agcgcttgaa gttttggaag atgtgcgaat tgatgatcta 12360
cagaaggatc aggttgctgc attgctttat catcgtgcta atattcattc agttcttgat 12420
cagtaagttt tcaatgccga aaaaaaatta aagttttaca aaaataaatt tcagagctga 12480
aaatgctgac tacaccttct ccgcagcctc tcaacttgtc gacttgcaaa atagtgtgac 12540
aaccactgga atcaagctca tgaaaaattg gggccaccat ctttacaaga gattcttctc 12600
tacgacagtt tgcaaggaaa ccggaaacaa cttcggacgg caggctctcg cttgttactt 12660
cattgcggct cgtgtggata acgatatcaa ggcgagaaaa ccgattgcca agattttgtg 12720
gctctcgaag cacttgaatg cgtgtggatc acatgaagtg atgaatcggg ttattaagaa 12780
gcaacttcat tcacttaatc tcttcaattg gctttactgg cttccacaat tggttactga 12840
tgttcgatat aaaccaaatt cgaactttgt tctgattctc tgcaaggtaa gttttgaaat 12900
atttaaatat tttcagaatt ttaaatgaaa ttcatttgca gatggctgct gctcatccac 12960
ttcaagtatt ttaccacatt cgggaggcag ttagcgttga cgatattgac tcggttctcg 13020
aagaagatta cactgatgag caaatgtcga tggatgtttc ggatgaggat tgttttgcag 13080
acgatccacc atttgataga attctgaaaa tatgtctgaa atatcgtcca actgatattc 13140
gagtcttcca tcgtgtcctc aaagaacttg acgagatgaa tgagacatgg gttgaacgtc 13200
acttgcgtca tgcgatctgc ctcaaggatc agatgttcaa agatttctcg gaacaaatgg 13260
acgcgacgtt caatgagatg caatattcgg aggatgtgac tatgatgacg ttgagatgga 13320
ggaaacagct ggaagaagac ttggtgtatt tccaacagaa ttataatctt gatttcctgg 13380
agattcgtaa caagcgaaag atgatcgtga cgaagggatg tatgggagtc gagaaaagtc 13440
agataatgtt cgaaaaagag ctgagtcaag tgttcacaga gccggccggc atgcaagatg 13500
aatttgattt tgtcacaaat atgactaata tgatggtctc acagttggat attcatgcag 13560
tcgatgctcc acgccctcag ggatatattc gtattgttct cgactggatt cgagcgattc 13620
gtcgtcgttt cgatcgactt ccacgaagaa tccctctgga atcgtcaagc ccatatctcg 13680
ccagattcag ccatcgtaca ggatgcatcg aaatgccata cgatttgctc aacgttttgc 13740
gcgccaagaa tcatactctg atggcttcca atcaaacggg gcaatacata tccatgctct 13800
ctcgatttga gccaaacttt gagattgtga tcaaaggtgg tcaagtgata agaaagatct 13860
atattcgagg acaaaccgga aagagtgcgg cgttttatct gaagaaatct gtgcaggatg 13920
agccaactaa ccgagttcca caaatgttca aacatcttga tcacgttcta caaaccgata 13980
gagagtcggc gagaagacat cttcatgctc caacagtgct gcagatgaga gtcggacaga 14040
agacgacact ctacgaagtt gcatccgttc aaccatatgc aatgccaccg gattgtacca 14100
gaaactatcc agcatcacaa atcgacattg ttcatccata tgatgtgctg actgccactt 14160
tcaatggaag ttattatccg gatgatatgg tattgcactt ctttgagaga ttcgcccaaa 14220
gttcttcatc catcggacaa cctcttccaa ctccgacgaa ccaagatgga acagttgctc 14280
cgccacgact aacggaagct caccacatca agaatattat ttatgagtac gtttgagaag 14340
ctagtgtcta aaataataat taatgtaaaa aaattttcag agactttgcc cgagatatga 14400
tcccattccg acttctctac gactacctca ctgcacgata tcctgatccg gttatgtact 14460
atgcaatgaa gaagcaattg ctgcacagtc tcgccgtcct atccacaatc gaatatcatt 14520
gcaatctgac accaatggga cctgatcaaa tgatgatgac aatgaatact ggagtcctta 14580
gcaatccttc atatagattc gaaatccgag gaggacgatc acttcatgat attcaacact 14640
ttggacatga agttccattc cgattgactc caaatctatc gattttggtt ggtgttgcac 14700
aggatggtga cttgttatgg agtatggctg ctgcgtcaaa atgtttgatg aagaaggaac 14760
ctgaagttat catgagaccg ttagtatggg atgaattcgc caacaataca gattgcgaca 14820
aatcggtaat tttactttaa tatgctaata gggaattgaa ctaatgtttt ccaagcgttt 14880
gcaggtattc gcgtgtcatg catcgaattc ttacatcaat ggtgtcgcga gcaagcttcg 14940
aaacacgaat agcgccgacg ccaaactcag aaaggacgat tgtgtgtcgc tgatcagtcg 15000
agccaaggat tcggataatc tggcccgaat gccacccacc taccacgcgt ggttctagat 15060 
ctcataatta ccgttctcta ttttgatccc gcctcccact ctcacagatc tctatacatt 15120 
tgtcaaatgt ttccaaatct tttatctgcc catacattcg tttttattgt tttgtttctt 15180 
ttctttcttt atttcttttc taaactttaa gatttatgta aatatttaac tgcgctggta 15240 
tttatgaaaa attcagataa agttttcaag tttaaaaaat cgaaaattcg aagtcggaag 15300 
ttctcttaca ggtgtagtaa gtaggcacaa tggcaatagg tacatggaag gcttgcggaa 15360 
ggcacatggg taggcataag atcgaaaaat aagctgatat ataaatatag ataggtattg 15420 
gttaggcaca aattaggcac gtaggtgtga gctggcaaat aggtaggcat gacgttcggc 15480
aaatcggcaa attgccgatt tggcgaaaat tttcaaatcc ggcgatttgc cggaaatgtt 15540 
tagagaaatt ttttataaga cagaaaaact tacaactgtg tctttttgaa attcttccgg 15600 
ttttctttat acagtgcgtg caacttctat agcgcccccc cccccccccc ccccccctat 15660 
tttttcgcgt ttcacgccat tctgattttt atttttctga tttttttttt tttgcactga 15720 
aacttggcat tgaggatgct tggagagaaa tatcagccag caaaataaag aatctggtca 15780 
actcaatgtc gaatagattt tttgaggtta tcgttaagaa gggaggtccc acgacgtatt 15840 
gatccttcat cgagttaaca aattatgatg ttttaattga tttcattcca cttctggaca 15900 
cagaaggacg aatagtgcaa tctggtacaa gtttatcacc acctacaact tcgtcgattt 15960 
gtggaaaatc tttcagacat gtctccatga gtgtctcaga acatcttggt caggtttgga 16020 
gtcgatccca ccgctgggag ccgagaatgg gcctctaaca c 16061 



<210> 13 
<211> 12195 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 13 

atggatccgg ctatggcttc tccaggctat cggtctgtgc agtccgatcg gagtaatcac 60
ctaacagagc tggaaacgag aattcaaaat cttgccgata attcacaaag agatgatgtc 120
aaattgaaaa tgttacaaga gatttggagc acaatcgaaa atcatttcac actaagttcg 180
cacgagaaag tcgtggagag gctcattctc tcgttcctac aagttttctg caacacaagt 240
ccacagttca ttgctgaaaa caatacacaa cagcttcgaa agttaatgct tgaaatcatt 300
cttcgacttt cgaacgtaga agccatgaaa catcatagca aagaaattat caagcagatg 360
atgaggctaa tcaccgtgga aaatgaggag aatgccaatt tggctatcaa aattgtcacc 420
gatcaaggga gaagtaccgg caaaatgcaa tattgcggag aggtttcaca gataatggtc 480
tccttcaaaa caatggtcat tgatctgacg gcgagtggtc gagctggtga tatgttcaac 540
ataaaagagc ataaagctcc accgtcaact agctccgacg agcaagtcat cactgaatat 600
ttgaagactt gctactatca acaaacggtt cttctcaacg gaacggaagg aaaaccgcca 660
ttaaaataca atatgattcc atcagctcat cagtcaacga aggtgctcct ggaggttccg 720
tatctcgtga ttttcttcta tcaacatttc aaaacagcga tccaaaccga agcgcttgat 780
ttcatgaggc ttggtcttga ttttctaaat gtcagagttc cagacgagga taaactcaaa 840
acaaatcaaa taataaccga tgattttgtc agtgcacagt cccgattcct gtcattcgtc 900
aacattatgg ctaagattcc agcgtttatg gatcttatca tgcaaaatgg accgcttcta 960
gtgtcgggaa caatgcagat gctcgagcgg tgcccggctg atctgataag tgtccgacga 1020
gaagttctga tggctttgaa gtatttcaca tctggagaaa tgaagtcgaa attctttcca 1080
atgctacctc gactcatcgc tgaggaggtt gttctgggaa caggattcac tgcgattgag 1140
catttgcgag ttttcatgta tcaaatgcta gcagatctgt tgcatcacat gcgaaattct 1200
atagactatg aaatgatcac acacgtgatt ttcgtattct gtcgcactct tcacgatcct 1260
aacaactctt ctcaagtcca gattatgtct gctcggctgc tcaactcact ggccgaatct 1320
ctgtgcaaaa tggattcaca tgataccttt cagactcgtg atctgctcat tgaaatcctg 1380
gagtcgcacg tggccaagct caaaactctt gcagtctatc acatgcctat tctcttccaa 1440
caatacggaa ccgaaataga ctacgaatac aaaagttatg agagagacgc cgagaaacct 1500
ggaatgaata tcccaaagga cactatacga ggagtaccga aacgaagaat ccgtcggctc 1560
tccattgatt cagttgaaga gctggaattc ctggcatcag aaccatccac gtcggaagat 1620
gcagatgaga gtggtggaga tccgaacaag cttcctccgc caacaaaaga gggaaagaaa 1680
acgtctcccg aagcgatttt aaccgccatg tcaacgatga cacctcctcc attggcaatt 1740
gttgaagctc gaaatcttgt gaagtatata atgcatacgt gtaaattcgt gacaggacaa 1800
ttgagaatcg cccggccatc acaggatatg
gaacgtcttc tacgatatgg tgtaatgtgt 
aatcaaccac aaatgcattc ttcaatgcgg 
ttggcaaacg tttttacaac aatcgaccat 
atggatttct tgattgaaag aatttacaat 
accttcttgg ttcgaaatga agtgccattc 
tctcgaatga aattgctgga agttagcaat 
aaaattatct tctccgccat cggagccaat 
acttcatacc tcccagagat tctcaaacag 
cctctcaact atttcctttt gcttcgtgca 
gatattttgt atggaaagtt cctgcagtta 
ttgacgaatc ttcagtcatg tcaacatcgg 
tgtttgactg tgccagttcg actcagttcc 
ccactggtgt gtgcgatgaa tgggagtccg 
gaattatgtg tggataactt gcaacctgaa 
ggagctttga tgcaaggcct ctggcgtgtt 
acagcagcgt tcaggatcct cggaaagttc 
ccgcaaattc ttcaagtagc cactttaggc 
ttctcgcgga tgggactcga tggcaatcac 
a g a gtcgttg ccgatcagat gagatatcca 
atgatcccgt caactcatat gaagaaatgg 
gccggacttg gatcttcagg aagcccaatt 
aagaaacttc ttgaagattt tgatccaaac 
a ggg aaa 9tg atcgagagct ttttgtgaat 
aataaagacg gtttccggca tgtctatagc 
gcgttgattg gagtactcga atacattggt 
gaaggtgttc taccattgtg ccttgactcg 
ctctctgaaa catcgtcaag cttcatcatt 
gagactctct cgcttacact tcccgatatt 
tacttgatgg agaaggtgtt caaattgtgt 
ggaatcaatg caattggata catgatcgaa 
gtgatagatg ttgttgattc gatcatggaa 
agtggatctg ctgattctgc atacgattgt 
aaagaagaag. gccaagaaga ggagaatctg 
tctaagcatt acttccacag taatgaaaga 
cattgtatgg ttcactcaag acttgcacca 
gagttctttg agccagaatt aatgcgggtg 
gacgcaggag gaagtttgga tggagttcaa 
gatttcgaaa aagatatgga catgtacaag 
caaaccgata catttacctt aaaccaaagg 
tcgcatttcc ttcctccatt cccaatcact 
ctacagtgtc ttgtgatcgc gtatgatcga 
gagctgggtg atgagcataa gatgatagag 
gttgatcaag tctacgagag cgatgaatct 
gcagtcactg acagagaaac tcctgaaatt 
gtctcaccaa tatccacaat catcatcgca 
agtggagcag gagatgacag tgattcagat 
ttcaagtgtc tcgtggagct caatccaaag 
gcaaatcaaa tggttaaata taagatgagt 
a gtagcttca ctgaagagga gctcgatgat 
gagttggata tgattggtca tacggttaaa 
acggagcaaa ttattgtgga tatcagtcgt 
caagatgtac ttgtaaattg gattgatgat 
gatgtatgga agttcttctt gtctcgagaa 
attcgaagaa tcatagtcta tcaatcaagt 
ccggaatatt ttgagaaact cattgatctt 
agaaaaatct gggatcgtga tatgtttgca 
tgccctgagt ggcttatttc tccgaattcc 



tatcattgtt cgaaggagcg agatttattc 1860 
atggatgtat tcgtgcttcc aacaactcga 1920 
acaaaagatg agaaagatgc tctggagtcg 1980 
gcgatattcc gggaaatctt cgaaaagtat 2040 
cggaactatc cattgcaatt gatggtgaac 2100 
ttcgcatcta cgatgctttc attcttgatg 2160 
gacaagacga tgctatatgt gaagctcttc 2220 
ggctctgggc ttcatggaga taaaatgctc 2280 
tcaactgtct tggcattaac agctcgtgaa 2340 
ttgttccgca gtattggtgg tggcgctcag 2400 
ctgccaaatc ttcttcaatt cttgaataaa 2460 
attcaaatgc gtgagctctt cgtcgagttg 2520 
cttctgccat acctaccgct tctgatggat 2580 
aacatagtta cacaaggatt gagaacattg 2640 
tatcttctcg aaaatatgct tcctgtccgt 2700 
gtatcgaaag ctccagatac atcatcgatg 2760 
ggaggagcca atcgaaaact tctgaatcaa 2820 
gacactgttc agtcgtacat caatatggaa 2880 
agcattcacc tgccactgtc cgagttgatg 2940 
gctgatatga tccttaatcc aagtcctgca 3000 
tgtatggaat tgtcgaaagc cgtcttgtta 3060 
actccaagtg caaatcttcc gaagattatc 3120 
aatcgtacca ctgaagtata cacatgtccg 3180 
gcacttctcg caatggctta cggaatatgg 3240 
aaattcttta tcaaagttct ccgccagttt 3300 
ggaaatggat ggatgcgtca tgcagaagag 3360 
tctgttatgg ttgatgctct gattatttgt 3420 
gctggtgtca tgtctcttcg tcatatcaat 3480 
gatcaaatgt cgaaagttcc aatgtgcaaa 3540 
cacgggcctg cttggtatgc aagatctggt 3600 
tcgtttccac gaaaatttgt tatggacttt 3660 
gttattttgg gaactgttga agaaatatca 3720 
ctcaagaaaa tgatgcgagt ctatttcatc 3780 
acactcgcga ctatttttgt gtctgcaatc 3840 
gtcagagaat ttgcgattgg tttaatggat 3900 
tcccttgata agttctacta tcgattcaag 3960 
ctcacaacag ttccaacaat gtcattggca 4020 
aactatatgt tcaactgtcc ggatggtttt 4080 
cgatatttgt cacatctgct ggatattgca 4140 
aatgccttca aaaaatgcga gacatgccca 4200 
acacatattg attcaatgcg agccagtgct 4260 
atgaagaagc aatacatcga caagggaata 4320 
atcctcgcac ttcgcagctc caagatcaca 4380 
tggagacgat tgatgacagt tctattgaga 4440 
gcggagaagc ttcatccttc acttttgaag 4500 
acatttggtg cttcttacat aagaaatatt 4560 
cgtcatattt cgtacaacga tataatgaag 4620 
attctggtca caaaaatggc agtgaatctc 4680 
gacaagatct ctaggatttt gtcagttccc 4740 
ttcgaagcgg agaagatgaa aggaattcga 4800 
atgcttgctg gatgcccagt gaccacattc 4860 
tttgctgctc attttgagta tgcttattcg 4920 
gtcacagtaa .tcctcaacaa aagtcccaaa 4980 
tcaattctag atcctgcacg cagatccttt 5040 
ggtccactgc gacaggaatt catggatact 5100 
gacgatgagg agaataagga tgaagatgag 5160 
ttttcgattg tcgatcgtat ctcgaagagc 5220 
ccaattccaa gaattaagaa gttgttctcc 5280 
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gaaacggaat tcaatgagcg atatgtggtt 
gaagagatca tagtgaaacg gatgacagag 
aataccttcc tgagatattt gaggctcaac 
gcctcgtgtt tcaatggcaa tttcgtcacc 
actgaagtca tcccgaaagt gccgttacaa 
cagaagtttg atacggatcc acaaactgct 
caatatttgg ttattcccac gttgcattgg 
gttggcaccg caccaataga tgattcggat 
tcggataacc ttgtggctcg tttaacatca 
gatggaatgg tcattgtttt ctatcaactt 
catattcaca ataataactg caagaaacaa 
gcctggccgt gcctgaccat gtacaatcat 
ttcttcttgg ccaatattat agagcgtttc 
ttccatcaac ttatgactac ttatcagcag 
gatatattaa ctccagcttt gaggacacga 
catgtgaaga aaattcttat cgaagaatgc 
caaatggtgg ttcgcaatta tcgtgtctac 
cttctgaacg gagttcaacg agcacttgtg 
tggcaaactc gacgtcatgc ggtggagatc 
agaacgctga aaacagatca tattatcagt 
ttggataagc tgcgaacagc ttcatccaca 
aagagagaca tgcctgatgc tcaacgcacg 
aatatgcttg tccgattctg tatgacgttc 
caaagtggga accatggtgt cgagttgacc 
ctacgaccaa gcatgtgggg agaatttgtc 
ttgtcaattc cgaatgataa tgctctacgc 
actatccaaa atgcacaaca cactctggat 
aaaactagct tgatgactat gatgagacaa 
aacggagctc agaactttaa gatgactcgt 
gaaaagacaa atgtttcggt taacgggctt 
tcccgattcc tacatgaaca ttttggatct 
gtgttgggag ttctcggagc attttctctt 
tacttggatc atttgatgcc ttcatttgta 
ttggcgtatg ttgcgaactc gcaagatgga 
gctgaattgt tgtgtgcatg catggagctg 
gagattaaga gatcaattgt tggtggtatt 
gataagatca tccagacgtc agtgaagctt 
gaatttacaa ttctcactgt tcttccgcta 
aagttcaaga attgcaagga tctgatagca 
gagaacagcg aatatcggaa ctcggaagct 
ggactcaaga gtagcgatcc tcaaacccgg 
tggccacaca tggcaacagt agatattgct 
gattggtcca agttcaaaca cgcgttttgg 
acgattgcca aacggccaac tgatccgaat 
tgtgcaactc catggagaac aattgaatat 
gtggaaactg aaatgaaacg agaagagcca 
caagatgatt ctaaggatgc cggagagccc 
ttgcttgctg gacaacaaga acttttggat 
ctagatacag tatcccagat tacatttgca 
tgggtagtgt tgttcaaatc attctggagt 
acggcgctag tcgttccgtt tatgagcagt 
caggatagtg tgcttgctgt t.tggcttgaa 
agattgattg agtttatctc atcaaaacac 
gagaatcata tatggacaat tccaaagcaa 
gtggcaccag gtctcgctgg agatattgag 
gagatatcag agtttgatca gttcgctgca 
acgatgagag caatgtcagc tatgcaattg 
gaaaaatcaa tgagcagtac gtatgaaact 



cgagcattga ctgaggtgaa gaaatttcaa 5340 
cacaagtaca aggttccgaa gctgattctg 5400 
atctatgact acgatctatt catcgttatc 5460 
gatctctctt ttcttcgcga atatcttgaa 5520 
tggcggagag agctgtttct tcgaattatg 5580 
ggaacaagta tgcagcatgt gaaggccctt 5640 
gcgttcgagc gatatgatac ggatgaaatt 5700 
tcttcgatgg atgtagatcc ggcaggcagc 5760 
gtcattgatt ctcatcgtaa ttatctgagc 5820 
tgcacattgt tcgtacaaaa cgcctccgaa 5880 
ggtggacgcc tacggatcct gatgctcttc 5940 
caagatccaa caatgcggta cactggattc 6000 
acaattaatc ggaaaatcgt gcttcaagtg 6060 
gacactagag atcaaatccg gaaagccatt 6120 
atggaagatg gacacttgca aatattgagt 6180 
cataatttgc aacatgttca gcatgttttc 6240 
tatcatgttc gattggagct tctcacgcct 6300 
atgccaaata gtgttctgga aaaatttagc 6360 
tgcgagatgg tcatcaagtg ggaattgttc 6420 
gacgaagaag ctctcgaagt tgacaagcaa 6480 
gatcgtttcg atttcgagga ggctcataac 6540 
attatcaaag agcacgccga tgtgattgtc 6600 
catcagaatt cgggttcttc gtccacttct 6660 
aaaaaatgtc agctgcttct acgtgcagcc 6720 
agcttccgat taacaatgat cgaaaagttt 6780 
aatgatataa gttctacggc ctacgctaat 6840 
atgctgtgta atattattcc tgttatgcca 6900 
ctccaacggc cactcataca atgtctcaat 6960 
cttgtcactc aaattgtcag tcggttactc 7020 
gatgagctgg agcaattgaa tcaatacatt 7080 
cttttgaatt gcagaaactt gagtggacca 7140 
ttgcgaacaa tttgtggaca cgagccagca 7200 
aaagtgatgg agagagctgc aaaagagcac 7260 
aatatggtga agaatttctt tccagatgtt 7320 
gtacgtccca gagtcgatca tatcagtatg 7380 
atcgcggagc tgattatcaa atcgaatcac 7440 
ctcggagcaa tgattagcac gcaggatatg 7500 
cttgttcgta tccaatcaat tattgtgacc 7560 
gactatcttg ttgtggttat taccgttttt 7620 
ggatctcgtc tctgggaagg attcttctgg 7680 
gagaaattct cgatagtttg ggagaagact 7740 
catcgaatga aatatatcat gcaaaatcaa 7800 
ttgaaattcg cactttgggg. aatgctacga 7860 
aataagagaa agaaagtgat actgttgaac 7920 
gcagcgaaat tgaaggatca gccaatggaa 7980 
gaaccgatgg aagttgacga aaaagactcg 8040 
aaggagaagg aaaagctcac attggaatta 8100 
gaagcttcca attatgattt tgcggatgct 8160 
cttaatgaga atcaagtgac aagcaagatg 8220 
tccttatcac aatccgaaat cgaagatttc 8280 
ggagtgcata ataattatca gacgggtgta 8340 
gctgttggtg acgctgttca tttgccgtcc 8400 
gaatgctggc ataccggaat caggcttctc 8460 
ctcaacaaca cgttactccg agaaatgaaa 8520 
acactcgaat ctcttggaac actctacaat 8580 
atctgggaac gccgtgctgt atttcctgat 8640 
ggagatatgg aattagctca atcttatctg 8700 
cttgctccga caatcaatcc aaacaacact 8760 
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tcaaattcgg agaagcatgt ttctccgatt 
atgtacatca caaattgctc ggagcttctt 
ggcaaagaca tgcaacatgt tcgtggcctg 
aatgtggtcg aggagtgtaa aagtcagata 
gattacactc ttttcaattt gatgagtact 
acacatatga aggaacgatg caaaattgca 
cgttggagag cacttccgtc agttgtttca 
aacttggttc gagaaattga agagtctaca 
tcaaacaaag tggatcaggc gttgatgggc 
aatagaacac caaccacttc ggatgatatg 
aatcagattc atggaatgat gcttcaaaga 
gtcgctgcaa ctggaaacca gtcaattgtt 
gccgtagcca aacatgccaa gaatcttgga 
aaattagctg gattgacagc cataccgatg 
ggcaagacac ttcgcgatat ggcaaacagt 
ttgtgtgaag cgcttgaagt tttggaagat 
gttgctgcat tgctttatca tcgtgctaat 
gctgactaca ccttctccgc agcctctcaa 
actggaatca agctcatgaa aaattggggc 
acagtttgca aggaaaccgg aaacaacttc 
gcggctcgtg tggataacga tatcaaggcg 
tcgaagcact tgaatgcgtg tggatcacat 
cttcattcac ttaatctctt caattggctt 
cgatataaac caaattcgaa ctttgttctg 
cttcaagtat tttaccacat tcgggaggca 
gaagaagatt acactgatga gcaaatgtcg 
gacgatccac catttgatag aattctgaaa 
cgagtcttcc atcgtgtcct caaagaactt 
cacttgcgtc atgcgatctg cctcaaggat 
gacgcgacgt tcaatgagat gcaatattcg 
aggaaacagc tggaagaaga cttggtgtat 
gagattcgta acaagcgaaa gatgatcgtg 
cagataatgt tcgaaaaaga gctgagtcaa 
gaatttgatt ttgtcacaaa tatgactaat 
gtcgatgctc cacgccctca gggatatatt 
cgtcgtcgtt tcgatcgact tccacgaaga 
gccagattca gccatcgtac aggatgcatc 
cgcgccaaga atcatactct gatggcttcc 
tctcgatttg agccaaactt tgagattgtg 
tatattcgag gacaaaccgg aaagagtgcg 
gagccaacta accgagttcc acaaatgttc 
agagagtcgg cgagaagaca tcttcatgct 
aagacgacac tctacgaagt tgcatccgtt 
agaaactatc cagcatcaca aatcgacatt 
ttcaatggaa gttattatcc ggatgatatg 
agttcttcat ccatcggaca acctcttcca 
ccgccacgac taacggaagc tcaccacatc 
gatatgatcc cattccgact tctctacgac 
atgtactatg caatgaagaa gcaattgctg 
tatcattgca atctgacacc aatgggacct 
gtccttagca atccttcata tagattcgaa 
caacactttg gacatgaagt tccattccga 
gttgcacagg atggtgactt gttatggagt 
aaggaacctg aagttatcat gagaccgtta 
tgcgacaaat cgcgtttgca ggtattcgcg 
gtcgcgagca agcttcgaaa cacgaatagc 
gtgtcgctga tcagtcgagc caaggattcg 
cacgcgtggt tctag 
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attgacaaag aatacgatca ttggatggag 8820 
cagtggcaaa atgtggccga cgtatgcaat 8880 
atcaacgcag catctcacat tccggactgg 8940 
gctggatgta ttccaccaag tttccattta 9000 
gttatgcgaa tgaatgaaaa ctcaagcccg 9060 
attcaagagt gcacagaagc tcatattagt 9120 
tatggtcatg tcaagattct tcaggcaatg 9180 
gatattcgca ttgctctgct cgaggcccca 9240 
gatatgaagt cgttgatgaa agtattccga 93 00 
ggattcgttt cgacttggta tgattggagg 9360 
ttcgaatatt gggataaagt aggactcaac 9420 
ccgattcatt caatggctca agcacagttg 9480 
ttccataatt taacgaaaga tctactcaac 9540 
atggatgctc aagataaagt ttgcacttac 9600 
gcggctgacg aaagagtgaa aaatgagcta 9660 
gtgcgaattg atgatctaca gaaggatcag 9720 
attcattcag ttcttgatca agctgaaaat 9780 
cttgtcgact tgcaaaatag tgtgacaacc 9840 
caccatcttt acaagagatt cttctctacg 9900 
ggacggcagg ctctcgcttg ttacttcatt 9960 
agaaaaccga ttgccaagat tttgtggctc 10020 
gaagtgatga atcgggttat taagaagcaa 10080 
tactggcttc cacaattggt tactgatgtt 10140 
attctctgca agatggctgc tgctcatcca 10200 
gttagcgttg acgatattga ctcggttctc 10260 
atggatgttt cggatgagga ttgttttgca 10320 
atatgtctga aatatcgtcc aactgatatt 10380 
gacgagatga atgagacatg ggttgaacgt 10440 
cagatgttca aagatttctc ggaacaaatg 10500 
gaggatgtga ctatgatgac gttgagatgg 10560 
ttccaacaga attataatct tgatttcctg 10620 
acgaagggat gtatgggagt cgagaaaagt 10680 
gtgttcacagj agccggccgg catgcaagat 10740 
atgatggtct cacagttgga tattcatgca 10800 
cgtattgttc tcgactggat tcgagcgatt 10860 
atccctctgg aatcgtcaag cccatatctc 10920 
gaaatgccat acgatttgct caacgttttg 10980 
aatcaaacgg ggcaatacat atccatgctc 11040 
atcaaaggtg gtcaagtgat aagaaagatc 11100 
gcgttttatc tgaagaaatc tgtgcaggat 11160 
aaacatcttg atcacgttct acaaaccgat 11220 
ccaacagtgc tgcagatgag agtcggacag 11280 
caaccatatg caatgccacc ggattgtacc 11340 
gttcatccat atgatgtgct gactgccact 11400 
gtattgcact tctttgagag attcgcccaa 11460 
actccgacga accaagatgg aacagttgct 11520 
aagaatatta tttatgaaga ctttgcccga 11580 
tacctcactg cacgatatcc tgatccggtt 11640 
cacagtctcg ccgtcctatc cacaatcgaa 11700 
gatcaaatga tgatgacaat gaatactgga 11760 
atccgaggag gacgatcact tcatgatatt 11820 
ttgactccaa atctatcgat tttggttggt 11880 
atggctgctg cgtcaaaatg tttgatgaag 11940 
gtatgggatg aattcgccaa caatacagat 12000 
tgtcatgcat cgaattctta catcaatggt 12060 
gccgacgcca aactcagaaa ggacgattgt 12120 
gataatctgg cccgaatgcc acccacctac 12180 

12195 
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<210> 14 
<211> 4064 
<212> PRT 

<213> Caenorhabditis elegans 
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Ala Met Asn Gly Ser Pro Asn lie Val Thr Gin Gly Leu Arg Thr Leu 
865 870 875 880

Glu Leu Cys Val Asp Asn Leu Gln Pro Glu Tyr Leu Leu Glu Asn Met
885 890 895

Leu Pro Val Arg Gly Ala Leu Met Gln Gly Leu Trp Arg Val Val Ser
900 905 910

Lys Ala Pro Asp Thr Ser Ser Met Thr Ala Ala Phe Arg Ile Leu Gly
915 920 925

Lys Phe Gly Gly Ala Asn Arg Lys Leu Leu Asn Gln Pro Gln Ile Leu
930 935 940

Gln Val Ala Thr Leu Gly Asp Thr Val Gln Ser Tyr Ile Asn Met Glu
945 950 955 960

Phe Ser Arg Met Gly Leu Asp Gly Asn His Ser Ile His Leu Pro Leu
965 970 975

Ser Glu Leu Met Arg Val Val Ala Asp Gln Met Arg Tyr Pro Ala Asp
980 985 990

Met Ile Leu Asn Pro Ser Pro Ala Met Ile Pro Ser Thr His Met Lys
995 1000 1005

Lys Trp Cys Met Glu Leu Ser Lys Ala Val Leu Leu Ala Gly Leu Gly
1010 1015 1020

Ser Ser Gly Ser Pro Ile Thr Pro Ser Ala Asn Leu Pro Lys Ile Ile
1025 1030 1035 1040

Lys Lys Leu Leu Glu Asp Phe Asp Pro Asn Asn Arg Thr Thr Glu Val
1045 1050 1055

Tyr Thr Cys Pro Arg Glu Ser Asp Arg Glu Leu Phe Val Asn Ala Leu
1060 1065 1070

Leu Ala Met Ala Tyr Gly Ile Trp Asn Lys Asp Gly Phe Arg His Val
1075 1080 1085

Tyr Ser Lys Phe Phe Ile Lys Val Leu Arg Gln Phe Ala Leu Ile Gly
1090 1095 1100

Val Leu Glu Tyr Ile Gly Gly Asn Gly Trp Met Arg His Ala Glu Glu
1105 1110 1115 1120

Glu Gly Val Leu Pro Leu Cys Leu Asp Ser Ser Val Met Val Asp Ala
1125 1130 1135

Leu Ile Ile Cys Leu Ser Glu Thr Ser Ser Ser Phe Ile Ile Ala Gly
1140 1145 1150

Val Met Ser Leu Arg His Ile Asn Glu Thr Leu Ser Leu Thr Leu Pro
1155 1160 1165

Asp Ile Asp Gln Met Ser Lys Val Pro Met Cys Lys Tyr Leu Met Glu
1170 1175 1180

Lys Val Phe Lys Leu Cys His Gly Pro Ala Trp Tyr Ala Arg Ser Gly
1185 1190 1195 1200

Gly Ile Asn Ala Ile Gly Tyr Met Ile Glu Ser Phe Pro Arg Lys Phe
1205 1210 1215

Val Met Asp Phe Val Ile Asp Val Val Asp Ser Ile Met Glu Val Ile
1220 1225 1230

Leu Gly Thr Val Glu Glu Ile Ser Ser Gly Ser Ala Asp Ser Ala Tyr
1235 1240 1245

Asp Cys Leu Lys Lys Met Met Arg Val Tyr Phe Ile Lys Glu Glu Gly
1250 1255 1260

Gln Glu Glu Glu Asn Leu Thr Leu Ala Thr Ile Phe Val Ser Ala Ile
1265 1270 1275 1280

Ser Lys His Tyr Phe His Ser Asn Glu Arg Val Arg Glu Phe Ala Ile
1285 1290 1295

Gly Leu Met Asp His Cys Met Val His Ser Arg Leu Ala Pro Ser Leu
1300 1305 1310

Asp Lys Phe Tyr Tyr Arg Phe Lys Glu Phe Phe Glu Pro Glu Leu Met
1315 1320 1325

Arg Val Leu Thr Thr Val Pro Thr Met Ser Leu Ala Asp Ala Gly Gly
1330 1335 1340

Ser Leu Asp Gly Val Gln Asn Tyr Met Phe Asn Cys Pro Asp Gly Phe
1345 1350 1355 1360

Asp Phe Glu Lys Asp Met Asp Met Tyr Lys Arg Tyr Leu Ser His Leu
1365 1370 1375

Leu Asp Ile Ala Gln Thr Asp Thr Phe Thr Leu Asn Gln Arg Asn Ala
1380 1385 1390

Phe Lys Lys Cys Glu Thr Cys Pro Ser His Phe Leu Pro Pro Phe Pro
1395 1400 1405

Ile Thr Thr His Ile Asp Ser Met Arg Ala Ser Ala Leu Gln Cys Leu
1410 1415 1420

Val Ile Ala Tyr Asp Arg Met Lys Lys Gln Tyr Ile Asp Lys Gly Ile
1425 1430 1435 1440

Glu Leu Gly Asp Glu His Lys Met Ile Glu Ile Leu Ala Leu Arg Ser
1445 1450 1455

Ser Lys Ile Thr Val Asp Gln Val Tyr Glu Ser Asp Glu Ser Trp Arg
1460 1465 1470

Arg Leu Met Thr Val Leu Leu Arg Ala Val Thr Asp Arg Glu Thr Pro
1475 1480 1485

Glu Ile Ala Glu Lys Leu His Pro Ser Leu Leu Lys Val Ser Pro Ile
1490 1495 1500

Ser Thr Ile Ile Ile Ala Thr Phe Gly Ala Ser Tyr Ile Arg Asn Ile
1505 1510 1515 1520

Ser Gly Ala Gly Asp Asp Ser Asp Ser Asp Arg His Ile Ser Tyr Asn
1525 1530 1535

Asp Ile Met Lys Phe Lys Cys Leu Val Glu Leu Asn Pro Lys Ile Leu
1540 1545 1550

Val Thr Lys Met Ala Val Asn Leu Ala Asn Gln Met Val Lys Tyr Lys
1555 1560 1565

Met Ser Asp Lys Ile Ser Arg Ile Leu Ser Val Pro Ser Ser Phe Thr
1570 1575 1580

Glu Glu Glu Leu Asp Asp Phe Glu Ala Glu Lys Met Lys Gly Ile Arg
1585 1590 1595 1600

Glu Leu Asp Met Ile Gly His Thr Val Lys Met Leu Ala Gly Cys Pro
1605 1610 1615

Val Thr Thr Phe Thr Glu Gln Ile Ile Val Asp Ile Ser Arg Phe Ala
1620 1625 1630

Ala His Phe Glu Tyr Ala Tyr Ser Gln Asp Val Leu Val Asn Trp Ile
1635 1640 1645

Asp Asp Val Thr Val Ile Leu Asn Lys Ser Pro Lys Asp Val Trp Lys
1650 1655 1660

Phe Phe Leu Ser Arg Glu Ser Ile Leu Asp Pro Ala Arg Arg Ser Phe
1665 1670 1675 1680

Ile Arg Arg Ile Ile Val Tyr Gln Ser Ser Gly Pro Leu Arg Gln Glu
1685 1690 1695

Phe Met Asp Thr Pro Glu Tyr Phe Glu Lys Leu Ile Asp Leu Asp Asp
1700 1705 1710

Glu Glu Asn Lys Asp Glu Asp Glu Arg Lys Ile Trp Asp Arg Asp Met
1715 1720 1725

Phe Ala Phe Ser Ile Val Asp Arg Ile Ser Lys Ser Cys Pro Glu Trp
1730 1735 1740

Leu Ile Ser Pro Asn Ser Pro Ile Pro Arg Ile Lys Lys Leu Phe Ser
1745 1750 1755 1760

Glu Thr Glu Phe Asn Glu Arg Tyr Val Val Arg Ala Leu Thr Glu Val
1765 1770 1775

Lys Lys Phe Gln Glu Glu Ile Ile Val Lys Arg Met Thr Glu His Lys
1780 1785 1790

Tyr Lys Val Pro Lys Leu Ile Leu Asn Thr Phe Leu Arg Tyr Leu Arg
1795 1800 1805

Leu Asn Ile Tyr Asp Tyr Asp Leu Phe Ile Val Ile Ala Ser Cys Phe
1810 1815 1820

Asn Gly Asn Phe Val Thr Asp Leu Ser Phe Leu Arg Glu Tyr Leu Glu
1825 1830 1835 1840

Thr Glu Val Ile Pro Lys Val Pro Leu Gln Trp Arg Arg Glu Leu Phe
1845 1850 1855

Leu Arg Ile Met Gln Lys Phe Asp Thr Asp Pro Gln Thr Ala Gly Thr
1860 1865 1870

Ser Met Gln His Val Lys Ala Leu Gln Tyr Leu Val Ile Pro Thr Leu
1875 1880 1885

His Trp Ala Phe Glu Arg Tyr Asp Thr Asp Glu Ile Val Gly Thr Ala
1890 1895 1900

Pro Ile Asp Asp Ser Asp Ser Ser Met Asp Val Asp Pro Ala Gly Ser
1905 1910 1915 1920

Ser Asp Asn Leu Val Ala Arg Leu Thr Ser Val Ile Asp Ser His Arg
1925 1930 1935

Asn Tyr Leu Ser Asp Gly Met Val Ile Val Phe Tyr Gln Leu Cys Thr
1940 1945 1950

Leu Phe Val Gln Asn Ala Ser Glu His Ile His Asn Asn Asn Cys Lys
1955 1960 1965

Lys Gln Gly Gly Arg Leu Arg Ile Leu Met Leu Phe Ala Trp Pro Cys
1970 1975 1980

Leu Thr Met Tyr Asn His Gln Asp Pro Thr Met Arg Tyr Thr Gly Phe
1985 1990 1995 2000

Phe Phe Leu Ala Asn Ile Ile Glu Arg Phe Thr Ile Asn Arg Lys Ile
2005 2010 2015

Val Leu Gln Val Phe His Gln Leu Met Thr Thr Tyr Gln Gln Asp Thr
2020 2025 2030

Arg Asp Gln Ile Arg Lys Ala Ile Asp Ile Leu Thr Pro Ala Leu Arg
2035 2040 2045

Thr Arg Met Glu Asp Gly His Leu Gln Ile Leu Ser His Val Lys Lys
2050 2055 2060

Ile Leu Ile Glu Glu Cys His Asn Leu Gln His Val Gln His Val Phe
2065 2070 2075 2080

Gln Met Val Val Arg Asn Tyr Arg Val Tyr Tyr His Val Arg Leu Glu
2085 2090 2095

Leu Leu Thr Pro Leu Leu Asn Gly Val Gln Arg Ala Leu Val Met Pro
2100 2105 2110

Asn Ser Val Leu Glu Lys Phe Ser Trp Gln Thr Arg Arg His Ala Val
2115 2120 2125

Glu Ile Cys Glu Met Val Ile Lys Trp Glu Leu Phe Arg Thr Leu Lys
2130 2135 2140

Thr Asp His Ile Ile Ser Asp Glu Glu Ala Leu Glu Val Asp Lys Gln
2145 2150 2155 2160

Leu Asp Lys Leu Arg Thr Ala Ser Ser Thr Asp Arg Phe Asp Phe Glu
2165 2170 2175

Glu Ala His Asn Lys Arg Asp Met Pro Asp Ala Gln Arg Thr Ile Ile
2180 2185 2190

Lys Glu His Ala Asp Val Ile Val Asn Met Leu Val Arg Phe Cys Met
2195 2200 2205

Thr Phe His Gln Asn Ser Gly Ser Ser Ser Thr Ser Gln Ser Gly Asn
2210 2215 2220

His Gly Val Glu Leu Thr Lys Lys Cys Gln Leu Leu Leu Arg Ala Ala
2225 2230 2235 2240

Leu Arg Pro Ser Met Trp Gly Glu Phe Val Ser Phe Arg Leu Thr Met
2245 2250 2255

Ile Glu Lys Phe Leu Ser Ile Pro Asn Asp Asn Ala Leu Arg Asn Asp
2260 2265 2270

Ile Ser Ser Thr Ala Tyr Ala Asn Thr Ile Gln Asn Ala Gln His Thr
2275 2280 2285

Leu Asp Met Leu Cys Asn Ile Ile Pro Val Met Pro Lys Thr Ser Leu
2290 2295 2300

Met Thr Met Met Arg Gln Leu Gln Arg Pro Leu Ile Gln Cys Leu Asn
2305 2310 2315 2320

Asn Gly Ala Gln Asn Phe Lys Met Thr Arg Leu Val Thr Gln Ile Val
2325 2330 2335

Ser Arg Leu Leu Glu Lys Thr Asn Val Ser Val Asn Gly Leu Asp Glu
2340 2345 2350

Leu Glu Gln Leu Asn Gln Tyr Ile Ser Arg Phe Leu His Glu His Phe
2355 2360 2365

Gly Ser Leu Leu Asn Cys Arg Asn Leu Ser Gly Pro Val Leu Gly Val
2370 2375 2380

Leu Gly Ala Phe Ser Leu Leu Arg Thr Ile Cys Gly His Glu Pro Ala
2385 2390 2395 2400

Tyr Leu Asp His Leu Met Pro Ser Phe Val Lys Val Met Glu Arg Ala
2405 2410 2415

Ala Lys Glu His Leu Ala Tyr Val Ala Asn Ser Gln Asp Gly Asn Met
2420 2425 2430

Val Lys Asn Phe Phe Pro Asp Val Ala Glu Leu Leu Cys Ala Cys Met
2435 2440 2445

Glu Leu Val Arg Pro Arg Val Asp His Ile Ser Met Glu Ile Lys Arg
2450 2455 2460

Ser Ile Val Gly Gly Ile Ile Ala Glu Leu Ile Ile Lys Ser Asn His
2465 2470 2475 2480

Asp Lys Ile Ile Gln Thr Ser Val Lys Leu Leu Gly Ala Met Ile Ser
2485 2490 2495

Thr Gln Asp Met Glu Phe Thr Ile Leu Thr Val Leu Pro Leu Leu Val
2500 2505 2510

Arg Ile Gln Ser Ile Ile Val Thr Lys Phe Lys Asn Cys Lys Asp Leu
2515 2520 2525

Ile Ala Asp Tyr Leu Val Val Val Ile Thr Val Phe Glu Asn Ser Glu
2530 2535 2540

Tyr Arg Asn Ser Glu Ala Gly Ser Arg Leu Trp Glu Gly Phe Phe Trp
2545 2550 2555 2560

Gly Leu Lys Ser Ser Asp Pro Gln Thr Arg Glu Lys Phe Ser Ile Val
2565 2570 2575

Trp Glu Lys Thr Trp Pro His Met Ala Thr Val Asp Ile Ala His Arg
2580 2585 2590

Met Lys Tyr Ile Met Gln Asn Gln Asp Trp Ser Lys Phe Lys His Ala
2595 2600 2605

Phe Trp Leu Lys Phe Ala Leu Trp Gly Met Leu Arg Thr Ile Ala Lys
2610 2615 2620

Arg Pro Thr Asp Pro Asn Asn Lys Arg Lys Lys Val Ile Leu Leu Asn
2625 2630 2635 2640

Cys Ala Thr Pro Trp Arg Thr Ile Glu Tyr Ala Ala Lys Leu Lys Asp
2645 2650 2655

Gln Pro Met Glu Val Glu Thr Glu Met Lys Arg Glu Glu Pro Glu Pro
2660 2665 2670

Met Glu Val Asp Glu Lys Asp Ser Gln Asp Asp Ser Lys Asp Ala Gly
2675 2680 2685

Glu Pro Lys Glu Lys Glu Lys Leu Thr Leu Glu Leu Leu Leu Ala Gly
2690 2695 2700

Gln Gln Glu Leu Leu Asp Glu Ala Ser Asn Tyr Asp Phe Ala Asp Ala
2705 2710 2715 2720

Leu Asp Thr Val Ser Gln Ile Thr Phe Ala Leu Asn Glu Asn Gln Val
2725 2730 2735

Thr Ser Lys Met Trp Val Val Leu Phe Lys Ser Phe Trp Ser Ser Leu
2740 2745 2750

Ser Gln Ser Glu Ile Glu Asp Phe Thr Ala Leu Val Val Pro Phe Met
2755 2760 2765

Ser Ser Gly Val His Asn Asn Tyr Gln Thr Gly Val Gln Asp Ser Val
2770 2775 2780

Leu Ala Val Trp Leu Glu Ala Val Gly Asp Ala Val His Leu Pro Ser
2785 2790 2795 2800

Arg Leu Ile Glu Phe Ile Ser Ser Lys His Glu Cys Trp His Thr Gly
2805 2810 2815

Ile Arg Leu Leu Glu Asn His Ile Trp Thr Ile Pro Lys Gln Leu Asn
2820 2825 2830

Asn Thr Leu Leu Arg Glu Met Lys Val Ala Pro Gly Leu Ala Gly Asp
2835 2840 2845

Ile Glu Thr Leu Glu Ser Leu Gly Thr Leu Tyr Asn Glu Ile Ser Glu
2850 2855 2860

Phe Asp Gln Phe Ala Ala Ile Trp Glu Arg Arg Ala Val Phe Pro Asp
2865 2870 2875 2880

Thr Met Arg Ala Met Ser Ala Met Gln Leu Gly Asp Met Glu Leu Ala
2885 2890 2895

Gln Ser Tyr Leu Glu Lys Ser Met Ser Ser Thr Tyr Glu Thr Leu Ala
2900 2905 2910

Pro Thr Ile Asn Pro Asn Asn Thr Ser Asn Ser Glu Lys His Val Ser
2915 2920 2925

Pro Ile Ile Asp Lys Glu Tyr Asp His Trp Met Glu Met Tyr Ile Thr
2930 2935 2940

Asn Cys Ser Glu Leu Leu Gln Trp Gln Asn Val Ala Asp Val Cys Asn
2945 2950 2955 2960

Gly Lys Asp Met Gln His Val Arg Gly Leu Ile Asn Ala Ala Ser His
2965 2970 2975

Ile Pro Asp Trp Asn Val Val Glu Glu Cys Lys Ser Gln Ile Ala Gly
2980 2985 2990

Cys Ile Pro Pro Ser Phe His Leu Asp Tyr Thr Leu Phe Asn Leu Met
2995 3000 3005

Ser Thr Val Met Arg Met Asn Glu Asn Ser Ser Pro Thr His Met Lys
3010 3015 3020

Glu Arg Cys Lys Ile Ala Ile Gln Glu Cys Thr Glu Ala His Ile Ser
3025 3030 3035 3040

Arg Trp Arg Ala Leu Pro Ser Val Val Ser Tyr Gly His Val Lys Ile
3045 3050 3055

Leu Gln Ala Met Asn Leu Val Arg Glu Ile Glu Glu Ser Thr Asp Ile
3060 3065 3070

Arg Ile Ala Leu Leu Glu Ala Pro Ser Asn Lys Val Asp Gln Ala Leu
3075 3080 3085

Met Gly Asp Met Lys Ser Leu Met Lys Val Phe Arg Asn Arg Thr Pro
3090 3095 3100

Thr Thr Ser Asp Asp Met Gly Phe Val Ser Thr Trp Tyr Asp Trp Arg
3105 3110 3115 3120

Asn Gln Ile His Gly Met Met Leu Gln Arg Phe Glu Tyr Trp Asp Lys
3125 3130 3135

Val Gly Leu Asn Val Ala Ala Thr Gly Asn Gln Ser Ile Val Pro Ile
3140 3145 3150

His Ser Met Ala Gln Ala Gln Leu Ala Val Ala Lys His Ala Lys Asn
3155 3160 3165

Leu Gly Phe His Asn Leu Thr Lys Asp Leu Leu Asn Lys Leu Ala Gly
3170 3175 3180

Leu Thr Ala Ile Pro Met Met Asp Ala Gln Asp Lys Val Cys Thr Tyr
3185 3190 3195 3200

Gly Lys Thr Leu Arg Asp Met Ala Asn Ser Ala Ala Asp Glu Arg Val
3205 3210 3215

Lys Asn Glu Leu Leu Cys Glu Ala Leu Glu Val Leu Glu Asp Val Arg
3220 3225 3230

Ile Asp Asp Leu Gln Lys Asp Gln Val Ala Ala Leu Leu Tyr His Arg
3235 3240 3245

Ala Asn Ile His Ser Val Leu Asp Gln Ala Glu Asn Ala Asp Tyr Thr
3250 3255 3260

Phe Ser Ala Ala Ser Gln Leu Val Asp Leu Gln Asn Ser Val Thr Thr
3265 3270 3275 3280

Thr Gly Ile Lys Leu Met Lys Asn Trp Gly His His Leu Tyr Lys Arg
3285 3290 3295

Phe Phe Ser Thr Thr Val Cys Lys Glu Thr Gly Asn Asn Phe Gly Arg
3300 3305 3310

Gln Ala Leu Ala Cys Tyr Phe Ile Ala Ala Arg Val Asp Asn Asp Ile
3315 3320 3325

Lys Ala Arg Lys Pro Ile Ala Lys Ile Leu Trp Leu Ser Lys His Leu
3330 3335 3340

Asn Ala Cys Gly Ser His Glu Val Met Asn Arg Val Ile Lys Lys Gln
3345 3350 3355 3360

Leu His Ser Leu Asn Leu Phe Asn Trp Leu Tyr Trp Leu Pro Gln Leu
3365 3370 3375

Val Thr Asp Val Arg Tyr Lys Pro Asn Ser Asn Phe Val Leu Ile Leu
3380 3385 3390

Cys Lys Met Ala Ala Ala His Pro Leu Gln Val Phe Tyr His Ile Arg
3395 3400 3405

Glu Ala Val Ser Val Asp Asp Ile Asp Ser Val Leu Glu Glu Asp Tyr
3410 3415 3420

Thr Asp Glu Gln Met Ser Met Asp Val Ser Asp Glu Asp Cys Phe Ala
3425 3430 3435 3440

Asp Asp Pro Pro Phe Asp Arg Ile Leu Lys Ile Cys Leu Lys Tyr Arg
3445 3450 3455

Pro Thr Asp Ile Arg Val Phe His Arg Val Leu Lys Glu Leu Asp Glu
3460 3465 3470

Met Asn Glu Thr Trp Val Glu Arg His Leu Arg His Ala Ile Cys Leu
3475 3480 3485

Lys Asp Gln Met Phe Lys Asp Phe Ser Glu Gln Met Asp Ala Thr Phe
3490 3495 3500

Asn Glu Met Gln Tyr Ser Glu Asp Val Thr Met Met Thr Leu Arg Trp
3505 3510 3515 3520

Arg Lys Gln Leu Glu Glu Asp Leu Val Tyr Phe Gln Gln Asn Tyr Asn
3525 3530 3535

Leu Asp Phe Leu Glu Ile Arg Asn Lys Arg Lys Met Ile Val Thr Lys
3540 3545 3550

Gly Cys Met Gly Val Glu Lys Ser Gln Ile Met Phe Glu Lys Glu Leu
3555 3560 3565

Ser Gln Val Phe Thr Glu Pro Ala Gly Met Gln Asp Glu Phe Asp Phe
3570 3575 3580

Val Thr Asn Met Thr Asn Met Met Val Ser Gln Leu Asp Ile His Ala
3585 3590 3595 3600

Val Asp Ala Pro Arg Pro Gln Gly Tyr Ile Arg Ile Val Leu Asp Trp
3605 3610 3615

Ile Arg Ala Ile Arg Arg Arg Phe Asp Arg Leu Pro Arg Arg Ile Pro
3620 3625 3630

Leu Glu Ser Ser Ser Pro Tyr Leu Ala Arg Phe Ser His Arg Thr Gly
3635 3640 3645

Cys Ile Glu Met Pro Tyr Asp Leu Leu Asn Val Leu Arg Ala Lys Asn
3650 3655 3660

His Thr Leu Met Ala Ser Asn Gln Thr Gly Gln Tyr Ile Ser Met Leu
3665 3670 3675 3680

Ser Arg Phe Glu Pro Asn Phe Glu Ile Val Ile Lys Gly Gly Gln Val
3685 3690 3695

Ile Arg Lys Ile Tyr Ile Arg Gly Gln Thr Gly Lys Ser Ala Ala Phe
3700 3705 3710

Tyr Leu Lys Lys Ser Val Gln Asp Glu Pro Thr Asn Arg Val Pro Gln
3715 3720 3725

Met Phe Lys His Leu Asp His Val Leu Gln Thr Asp Arg Glu Ser Ala
3730 3735 3740

Arg Arg His Leu His Ala Pro Thr Val Leu Gln Met Arg Val Gly Gln
3745 3750 3755 3760

Lys Thr Thr Leu Tyr Glu Val Ala Ser Val Gln Pro Tyr Ala Met Pro
3765 3770 3775

Pro Asp Cys Thr Arg Asn Tyr Pro Ala Ser Gln Ile Asp Ile Val His
3780 3785 3790

Pro Tyr Asp Val Leu Thr Ala Thr Phe Asn Gly Ser Tyr Tyr Pro Asp
3795 3800 3805

Asp Met Val Leu His Phe Phe Glu Arg Phe Ala Gln Ser Ser Ser Ser
3810 3815 3820

Ile Gly Gln Pro Leu Pro Thr Pro Thr Asn Gln Asp Gly Thr Val Ala
3825 3830 3835 3840

Pro Pro Arg Leu Thr Glu Ala His His Ile Lys Asn Ile Ile Tyr Glu
3845 3850 3855

Asp Phe Ala Arg Asp Met Ile Pro Phe Arg Leu Leu Tyr Asp Tyr Leu
3860 3865 3870

Thr Ala Arg Tyr Pro Asp Pro Val Met Tyr Tyr Ala Met Lys Lys Gln
3875 3880 3885

Leu Leu His Ser Leu Ala Val Leu Ser Thr Ile Glu Tyr His Cys Asn
3890 3895 3900

Leu Thr Pro Met Gly Pro Asp Gln Met Met Met Thr Met Asn Thr Gly
3905 3910 3915 3920

Val Leu Ser Asn Pro Ser Tyr Arg Phe Glu Ile Arg Gly Gly Arg Ser
3925 3930 3935

Leu His Asp Ile Gln His Phe Gly His Glu Val Pro Phe Arg Leu Thr
3940 3945 3950

Pro Asn Leu Ser Ile Leu Val Gly Val Ala Gln Asp Gly Asp Leu Leu
3955 3960 3965

Trp Ser Met Ala Ala Ala Ser Lys Cys Leu Met Lys Lys Glu Pro Glu
3970 3975 3980

Val Ile Met Arg Pro Leu Val Trp Asp Glu Phe Ala Asn Asn Thr Asp
3985 3990 3995 4000

Cys Asp Lys Ser Arg Leu Gln Val Phe Ala Cys His Ala Ser Asn Ser
4005 4010 4015

Tyr Ile Asn Gly Val Ala Ser Lys Leu Arg Asn Thr Asn Ser Ala Asp
4020 4025 4030

Ala Lys Leu Arg Lys Asp Asp Cys Val Ser Leu Ile Ser Arg Ala Lys
4035 4040 4045

Asp Ser Asp Asn Leu Ala Arg Met Pro Pro Thr Tyr His Ala Trp Phe
4050 4055 4060

<210> 15
<211> 4896
<212> DNA
<213> Caenorhabditis elegans
<400> 15

ttgttttcgg attttttgtg tgcttcgtag ttgctccgat gatgccggat tcaacatttg 60 
aatgtaacat ttgaattttg aaattgaagg aattcatttg aatctaaagc ttgcagggtc 120 
aagaccgata cattcttgca acacatgact cgaaagtatg taggaaaaat tgaagttgga 180 
aacttggaat ttgatgaaaa agtacagtaa tccattctct cttatttcgc aactttcttc 240 
gatttttgat ttttcctaga ttttttaagc taaaattttg ctgttttatt ttcatttttc 300 
atgcttttca atttcggttt tcaacaaaat tatgtttttc agagaaaatc tcgtgaacaa 360 
taactcggct actgtaccat ttaaaggcgc acaccttttc gcgcagcatt gatttaaatt 420 
tttttgttcg tggctcaaca gtgcaatgga catctagata tctgaaattt taccactgaa 480 
ttcagttcat tttttaagca tcttcaaaaa tttgcgtttt cctaattttc ttgtgatcgt 540 
tttttttttg aaagtacaat cgtacattat aaataactat ttttcaattc gaataattta 600 
attcaagatc atttcgcaaa ataattgcct tgaaacgtta tgccgcggtc aattttcaac 660 
cacccttgtt attctttttt gaattgccgc cctttttccc tgtggccggc gcagtgcggc 720 
cgaggttggt ttctaggcca gccggcgcgt tttatttttt tcgagcatga tttcacaatt 780 
atttcttgca tttttaaagt tttttattga taaaatagta aaactaacaa cggataatat 840 
tattttaaaa ttaaaaaact agtttgttca tttttggatc gatttttaga tgttgttcat 900 
ggattatgca cgcaagaaag tactatcgtt cacatttgat tgctatatta ttgaatattg 960 
aatttttcac acaaaattgt actatttcca gatatttatc atgaccgagc cgaagaagga 1020 
gattatagag gacgaaaatc atggaatatc caagaaaata ccaacagatc ccaggcaata 1080 
cgagaaagtt acagagggat gccggttatt ggtcatgatg gcttcacaag aagaagaaag 1140 
ttagttttta catctattta aacacatttt ccaattattt tcaggatggg ccgaagttat 1200 
ttcaagatgc cgagctgcaa atggttcaat taaattctat gtccattata tcgattgcaa 1260 
ccgaagactt gacgaatggg ttcagtctga taggctcaat ttagcgtcgt gtgagctacc 1320 
aaaaaaagga ggaaagaaag gagcacactt gcgggaagaa aagtgagaaa tctataaact 1380 
tttcaaaaga ttttaaatag ttttatcaat tcataattat ttcagtcgag attcgaatga 1440 
aaatgaagga aagaaaagcg gccgaaaacg aaagattcca ctacttccga tggatgatct 1500 
caaggcggaa tccgtagatc cattacaagc aatttcaacg atgaccagcg gatctactcc 1560 
aagtcttcga ggttccatgt cgatggtcgg ccatagtgaa gatgcaatga caaggatccg 1620 
aaatgtcgaa tgcattgaac taggaagatc acgaattcag ccatggtact ttgcacctta 1680 
tccacaacaa ttgacaagtt tggattgtat ttatatttgc gaattttgtc tgaaatatct 1740 
aaagtcgaaa acttgtctga aacggcacat ggtgagtgtt tcgagttata gaaaatgacc 1800 
gaatataaat aactgttttc aaaattcaaa aattttcaat tttccaaaaa tgaaagaatc 1860 
ggtgaattcg aaaaaattcg agttcttgtg tgtttttggc tgaatttttc ggtttttctt 1920 
gctttttccg ttgatattag ttttgaaaca atgtttttaa aattttccgg catcgaaaaa 1980 
aatcgcaaat tctgggaatt tgctccaaaa attgcatttt tgaaatactt ttttgcgaaa 2040 
acgaaaaaaa aattcacaaa cggtgtttca aaccaaattt atcgtaatca aaaaagtttc 2100 
gcaaataggc cattattctg cgtgggaatt caaattaaaa tcagctactt tttctatttt 2160 
gcaaaatgga aaaaaaacgt aaaaaataga caaattttta attttttaaa caattacatt 2220 
cggtccatac tcttcatttt ctatcattta attaaaatgc ccaattctaa ttaattttat 2280 
ttcaggaaaa atgtgcaatg tgtcacccac ctggcaatca aatctacagt cacgataaac 2340 
tttcattttt tgaaatcgac ggccgcaaaa acaaaagcta tgctcagaat ctatgcctgc 2400 
ttgccaaact ttttctggat cacaagactc tttactatga cacggatcca tttttgttct 2460 
atgtgctaac cgaagaagac gagaagggtc atcatatagt tggatacttt tcaaaagaaa 2520 
aagaatcagc tgaagaatat aatgttgcgt gtattcttgt gttacctcca tttcaaaaga 2580 
aaggatacgg aagtttgctc atcgaattca gctatgaact ctcgaaaatt gaacagaaga 2640 
caggatcacc cgaaaaacca ctatcagatt tgggacttct ctcatatcga tcgtactggt 2700 
caatggccat catgaaagag cttttcgcat tcaaaagacg acatccaggc gaagatatca 2760 
cagttcagga catttcacaa agtacatcga ttaaacgaga agatgttgtg tcaacgttac 2820 
agcaacttga tctatacaaa tactataagg gatcatacat aattgtgatt agtgatgaaa 2880 
agcgtcaagt ttatgagaaa cggattgagg ctgcgaaaaa gaagacacga attaatccag 2940 
cagctctgca atggcgaccc aaagagtacg gaaagaaaag agtgagtttt tttcaatcaa 3000 
aaattcgtgt ttacggctaa aaactgaaaa ttaaaattaa attaaattcg tgataacatt 3060 
tttttttcaa aaaaccaaaa aaaaacaatt tcgtttttgg cagaaccaaa aaaaaaattt 3120 
aaaaaaaaac ggtttacgcc ctatttcata caaacaacag aaattgcact tttttgagca 3180 
aatttgaccc tacaattttt ttccagtttt ttgctctttt tcaaaaaaaa acacctaaac 3240 
actggaaata ctaaatacta aggaaaaaaa tggaaatact ggtttacagt gtcaaaaaat 3300 
tgaaattttc taataaaatc atttttcttt ttactaaatt tatcaaaaat ttataactca 3360 
aatctttcag tttttgcgaa ttttttttcg aaaaaacgaa aaaaaataaa cctaatttta 3420 
accaaattgt aattttgaaa aatctggaac gtccggaaaa ctgaaaaatt aaaaaaaaaa 3480 
cttttcagaa atttattttt aaaaaaccgt ttttttaaat caaattttgt atatgttgat 3540 
gagaaaaaaa aatagaaatc aatgttttta agttttaaaa gaaaaattta ttttaattat 3600 
tttagtttta ataaggtatt taaacagtaa caaggatgtc ggtttttcga ttttccgaaa 3660 
aactaaaaaa ttgtcttttt cgatttttta atcgaaaaaa aatagaaata ttttcacaaa 3720 
acatactatt cttctaaaaa aaagaatagt ggcagatttt aaataatttt tgaactctcg 3780 
caattttttt cgaaatatcc aaaaatcgaa aaaccggcac aaaagcaaaa agtctccggg 3840 
aatatatctt taaattattt tatgaacttt tttttcaggc gcagatcatg ttctagcaac 3900 
aacgacatgt gttctcgcca cgacgatctc aacctgtaca ttaaaatata acactccgtt 3960 
ttatctcgca tctacacacc gaaaagctta cgctatccct ttatcattcc cacaccgctc 4020 
agagagcgta cgcctcattt catttcattt gttctgtgta ataatttgac ttattagtca 4080 
cttatttttt taatgaaatt attcttgaat ttcataatct tcttgttgca gttcaaataa 4140 
ttaaaattca tcatatagac aagtaagttt ataactgcaa aagtgaagtt ttctaatcat 4200 
taagcgttct gaagatattc ggcaaccgcc tgagcgatca gatcacggcg ggaacgagtt 4260 
gaggcgtaga catgcttgca gccagtgaca acctgaaaga tattcaaaaa attaatttca 4320 
ggactcgaat ttttaacaat ctgaataaaa aaatccaaaa ttgtatatta tagagttttt 4380 
tgaaatctaa gcgaaagcgc gctccaatgt aaaacgaaaa gtgctccgcc cctaaacgtt 4440 
gggtcccgtt aggaatttgt tattttttcg gttatttctg actatattat aatttcgaaa 4500 
cgacaagtat tttaaacatc atttcgacat aaaaaatatg taaaacaaca aaaaacaatc 4560 
gaaaaaatag tgaaaaagtt tgaatttaca gtctcgccgc ctcctaccga gacctaacgt 4620 
taggaggcgg agcgttttcc tttggcattg aagcgcgctt gctgcggccc cataattaat 4680 
aacttacagc ctttgcaaag tccttcttct gttcatcctc aatctcgtca atgtattgat 4740 
tggacaactt ctcaatctcg gactgttccg cattttcatc cttcaatttt ttgtattgag 4800 
ccttgaattg agccaccttc tcctctccga aagccttaac cgaatactcc ttacaagctt 4860 
ctttcaactt gccctcggcc ttctccttgg catctc 4896 

<210> 16 
<211> 1377 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 16 

atgaccgagc cgaagaagga gattatagag gacgaaaatc atggaatatc caagaaaata 60 
ccaacagatc ccaggcaata cgagaaagtt acagagggat gccggttatt ggtcatgatg 120 
gcttcacaag aagaagaaag atgggccgaa gttatttcaa gatgccgagc tgcaaatggt 180 
tcaattaaat tctatgtcca ttatatcgat tgcaaccgaa gacttgacga atgggttcag 240 
tctgataggc tcaatttagc gtcgtgtgag ctaccaaaaa aaggaggaaa gaaaggagca 300 
cacttgcggg aagaaaatcg agattcgaat gaaaatgaag gaaagaaaag cggccgaaaa 360 
cgaaagattc cactacttcc gatggatgat ctcaaggcgg aatccgtaga tccattacaa 420 
gcaatttcaa cgatgaccag cggatctact ccaagtcttc gaggttccat gtcgatggtc 480 
ggccatagtg aagatgcaat gacaaggatc cgaaatgtcg aatgcattga actaggaaga 540 
tcacgaattc agccatggta ctttgcacct tatccacaac aattgacaag tttggattgt 600 
atttatattt gcgaattttg tctgaaatat ctaaagtcga aaacttgtct gaaacggcac 660 
atggaaaaat gtgcaatgtg tcacccacct ggcaatcaaa tctacagtca cgataaactt 720 
tcattttttg aaatcgacgg ccgcaaaaac aaaagctatg ctcagaatct atgcctgctt 780 
gccaaacttt ttctggatca caagactctt tactatgaca cggatccatt tttgttctat 840 
gtgctaaccg aagaagacga gaagggtcat catatagttg gatacttttc aaaagaaaaa 900 
gaatcagctg aagaatataa tgttgcgtgt attcttgtgt tacctccatt tcaaaagaaa 960 
ggatacggaa gtttgctcat cgaattcagc tatgaactct cgaaaattga acagaagaca 1020 
ggatcacccg aaaaaccact atcagatttg ggacttctct catatcgatc gtactggtca 1080 
atggccatca tgaaagagct tttcgcattc aaaagacgac atccaggcga agatatcaca 1140 
gttcaggaca tttcacaaag tacatcgatt aaacgagaag atgttgtgtc aacgttacag 1200 
caacttgatc tatacaaata ctataaggga tcatacataa ttgtgattag tgatgaaaag 1260 
cgtcaagttt atgagaaacg gattgaggct gcgaaaaaga agacacgaat taatccagca 1320 
gctctgcaat ggcgacccaa agagtacgga aagaaaagag cgcagatcat gttctag 1377 

<210> 17 
<211> 458 
<212> PRT 

<213> Caenorhabditis elegans 



<400> 17 
Lys Lys Thr Arg Ile Asn Pro Ala Ala Leu Gln Trp Arg Pro Lys Glu 

435 440 445 

Tyr Gly Lys Lys Arg Ala Gln Ile Met Phe 
450 455 



<210> 18 
<211> 9890 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 18 

tttcaaaaaa aaaaaattac ctcgtcaatt tcactctcct cgatgcgatg attatcctcg 60 
tccattttac ctgaaaagtg tgattttttc acgaataaaa ttattttcag atacttctag 120 
aaaaaaaaaa ctgaacggaa tgttacgaaa ttaattttca aagttgcgaa actgaatttt 180 
cgacaaaaag tttcactgat attcatttca agcatattgc aacgttttta aattaatttc 240 
taagagaaaa aactgcaaaa caattcgaaa ataattttta caagttactt ttcgaaaaag 300 
taacaaaaat ccactaatga acaagaaatt tttgaacaaa aagagcttct caggctattt 360 
ttggacgaat attttaataa aactttaaaa aaatcaacga aaatccccta aaaatcgctg 420 
aaaattccaa aaattaaagt tcattctcga ccacacctct cgtaaatcag cacgagactc 480 
acgcaacgcg accgcgccgc actcaacggc attgagtaat gcggagcggc agcgtcgcgt 540 
cgtctatttg tgtgtgtgtg cgattgtgtg tggtgcgacg tggccgctct gtgtgcctct 600 
ctagtgagtg ttttccgacg agagacaaca cattttcgag agacgaagag agtggcgacg 660 
aggaagatag tgtggtaaga ggagagtgtg cgcgagggaa agagagcaaa gtgtgagtgt 720 
ctgtgagaag agaaggagac cccccccccc ccgcgctcaa ccagtcgata gttggcctga 780 
gtgtagggcc ttctgttgta ttccactgct aacccccccc aaacacacaa aaagactcaa 840 
aaagtactgc ttaaaacaca gtgctcagct catttcattt ttgattttta tgctcgccgt 900 
catcggcgga tgaattcatc gcaaagtccg tggcgattca acacgtgcgg cgtcctcgcc 960 
gctcttctta accgtagtta caacgtggga gtacagaaag atggccacta cttcgaaggc 1020 
gtttcgagcc cgggcgctcg actcgaaccg gtctatgact gtatactggg gccacgaact 1080 
tccggaccta tcagaatgca gtgttggaaa ccgggcggtg acacaaatgc cgtctggcat 1140 
ggaaaaagaa gaagaacagg ttggtttttg gtggattatg gattactgct ccattttgaa 1200 
atttttcgag ttttaatgtc ttttttcgaa ttcctggtgc ttttttctat ccgaatcatg 1260 
ttttaattcc gttttccgac tactttgaag aattttcaaa tttttgatcc ctgatgacgt 1320 
cactattttt gtctttgcct ttctggatcg cttttatagt tattttcatt ttttatttct 1380 
tttttacact tttaaactta acaattctct taattcatcc tattctattt aattttaagt 1440 
tttgattttt gatttttgat ttttctcttt tctcttttag ccgccggtgg gcctttatta 1500 
caactcttaa atcataaaaa aaatcagttt aagcagttat acataactct tattatgaaa 1560 
aaatcgttat ttttcgacgg aaacttcata ctttgaattt atttccaatt tagattttat 1620 
tttctcaaag tcagctcaat taactaactt aaaatgtttt gtcctacccg caaaatgttt 1680 
ttttttaata ttttaattct attttaattt ttggctttaa aaaatcattt tgctaagcct 1740 
gagatgaagg cgaaatctcg agaaaaagca tttaaaaagt aataaattcc gttaaaaacg 1800 
actttttcta tcacagaaag tgttctctga gtgctaacaa ccttcttctg tccaaatttt 1860 
gacacaattt cccaattatg ccgacttatt acaccttttt ccgtcaatct tctagttttt 1920 
cccaccctct tgacccctgg tgacgtcatt tgtttgttct tcttccaaga catgccctgt 1980 
ggggtatttt ttctcaaaat ttttgcaaat ttattggatt ctaaataaaa ttccaggagt 2040 
ctagcaccag gaataataat gcaaatttga aaaaaaaatt aaacagaaat aatgatttta 2100 
aatgattatt taaattttaa attttaaatt tccaggaaaa acacctgcaa gaagcgattg 2160 
ctgcccagca agccagtaca tcgggtattc agctgaacca tgtcattcca actccaaaag 2220 
tcgaccgagt cgaagatcaa cgctatcact ccacttatca caacaagaat aaaatgcacc 2280 
gttcaaagta tatcaaagtt catggtgagt ttttttaacc aaaatttcgg cgaaaataat 2340 
ttaatttccg gttttttgaa attaatttcc gcttgggttt tcttgtattt attatttttt 2400 
caaattcctc tctgaattcg aaagaaaata acttgatttt tcagacttcc tggctaaaac 2460 
cttcaaaaat gtttgttgat tggttccaaa ttttcgcctg attccgaatt tcgatgtgac 2520 
aaattcaaaa aaaaattccc tgattttata ttcaagcttt gtgtttgtgt gttctttttg 2580 
gagcgcgctt gcatcgtttg attttcttcg tcttttttaa aatttatttt cgcttgtttc 2640 
attcattttt gtcgagtttt tttctgccaa aatgaatgaa actggtttaa aaaattgaat 2700 
tcggcgaaaa taaattttga aaaacgaaac aaatcaaacg atgcaagcgc gctccaatgc 2760 
gatttttttg ggcgcggaaa ttcgtgattt caagcttaaa tataaaatca ggtatatttt 2820 
ttcgactttt ttcacgttga aattcggaat cagaggaaaa ttttgagtca atcaaaaata 2880 
tttcccagat ttcggtatct ttaatgcatc aaaaatgaac tttcaccccc atactcccag 2940 
aaaaataaga aaacaaattg cgaaatattg ttccctgatc aaattttttc tttttttaac 3000 
tacacttctc tgttttgaag tgagaaagta catttttctg cgtttcttat cagttatcat 3060 
ttgaaaagga tcagaatttg atgacgatat atttgtttag ttacctccct tttttctgaa 3120 
cagtttttgc gaaaaaagga gaaaaaccgg aattttctat gaaaatgtga tttattttca 3180 
gcctggcaag cactcgaacg agacgaaccc gagtatgact acgacacaga agatgaagca 3240 
tggctatcag atcacactca cattgacccg cgcgttttgg aaaagatatt cgacacagtg 3300 
gagagccatt catcggagac acagatcgcg agcgaagatt cggtgattaa tttgcataaa 3360 
tgtaagttga cgaaatttcc attgaaaccc ccccccccca aaaatatcgt ttaattgcag 3420 
cactggactc atcaatcgtg tacgaaatat acgaatattg gctgtcgaag cgaacatcgg 3480 
ctgcgacgac gtctggttgt gttggagtcg gtggattaat tccgagagtc aggacagaat 3540 
gtcggaaggt aagaatttga ctattttgaa cgaatttcgt gatgaaactt ctctaaaact 3600 
tttaaagttt tttatggcgg ttcaaaattt cggaaaattt acactgattt tagctaaaaa 3660 
cttgaatttt ggtcatttgt ccgtgtcaca tctgtccgaa atcgactttt tttggaatta 3720 
tcatccttta ttgcacattt ggctagttta tctcatttaa tttcgttgat tactaaggta 3780 
catttaaagc caataggtaa ccaaccaaaa actatcataa tttttctaca ctttttaatt 3840 
ttccgacact acttgaataa ccccataagt gaccaatttt gatagttttt ggctggttac 3900 
cggctttaaa tgtaccttat taatcaacaa aattaaatga gataaactag ccaaatgtgc 3960 
aataaaggat gataattcca taaaaagtcg attttggaca gatgtgacac gggcaaatga 4020 
ccaaaattca agtttttagc taaaatcagt gtatttgttt cgaagttttg aaccgctata 4080 
aaaaaatttt tggaatgctt ttggcaagtt tcattacgaa attcactcat tttctatacg 4140 
caaaaattag aattttcaat taaaaattca ttttacagga tggacaaggt gttatcaatc 4200 
cgtacgttgc attccgtcga cgtgccgaga aaatgcagac tcgaaagaat cggaaaaacg 4260 
atgaagattc gtatgagaag attctcaagt tggtacatga catgtcgaaa gctcaacagc 4320 
tcttcgatat gactgcccga cgagaaaagc agaagctcgc gttgattgat atggaatcgg 4380 
agattttagc gaaacgaatg gagatgtcag attttggtgg ttctccgagt tcgttcaatg 4440 
agatcaccga aaagattcga gcagcagcaa cgttggaagt cgtgaaacca ccactggcag 4500 
aaatcaacgg atcagatgaa gtgaagaaga ggaagaagcc gagacgaaag attgctgata 4560 
aggatttaat atcgaaagcc tggcttaaaa agaatgcaga aagttggaat cggccgccgt 4620 
cgctctttgg acaacacagt ggaaatgttc cgacggttac aacgaagcca gttcgagagt 4680 
cgttggcgaa tgggcgattt gcgttcaagc ggaggagagg atgtgtttat cgcgcggctc 4740 
tcaccgttta caatgtgcct acagcgcctg ctacagtacc tccagtacag actcaagcag 4800 
cagtggcttc atcatcatcg tcaaaatcaa cggatatggt gccgtcgaac atgaagttct 4860 
ttgaaacttt tgttcgggat tcacaggatt cagtttctcg atctcttggc tttgtacgcc 4920 
gacgaatggg acgaggtggg cgagttgtat tcgatcggat gcctcgcaat cgagacgaca 4980 
acgacgaacg cacttcgaca gatccatggg ccgagtattg tgtcgcggat agttcaaggt 5040 
gagatttttg aataagaatc ttaatttcac gagattttgg tttttttcgc tgctttttct 5100 
gtaattttgt ggtatttttt ctcgtatttt caattaaaaa acgggtttta aataatttta 5160 
acctgaaatt tcgctaaaaa ccaagaaatt tcattaaaaa atgcaacaaa aaaaaagact 5220 
ggaggcacca ccgaatggag aacaggagaa cccaaaacca cgcccatttt tccgtgccgg 5280 
gcggcgaaaa tttttgcaga atttgctgca atttttcgtt ttacaaacga aacaacgaag 5340 
ctctgaatgt gttatttcgg agcttcgttg tttcgtttgt aaaacgaaaa attgcagcaa 5400 
tttctgcaaa aatttgcgcg cggcacggaa aaatgggcgt agttttaggt tctcctgttc 5460 
tcctttdggt ggtgcctcca gtctttttcg cattcttaat gaaatttctt tgttttttag 5520 
cgaaatttca ggttaaaatt atttaaaacc cgtttttttt tcaattggaa atgcgaggaa 5580 
aaaccacaaa atcacagaga aagcttttgg attttttcgc agctttttct gtgattttgt 5640 
ggtttttcct cgcattttca attgaaaaaa aaacgggttt taaataattt tcacctgaaa 5700 
tttcgctaaa aacgaggaaa tttcattaca aatgcaaaaa agactggagg caccaccgaa 5760 
accgaatgca gctcagaaca ggatttacca aaacaggatg cagtaggcgg agccaattcg 5820 
caaccaccgc atgcttattt cgcatgcctc gcacgttttt tttttctctt gaaacaatgc 5880 
aacaatatca aggaaaaaac gtgcgagact tgcgaaataa gcatgcggtg gttgcgaatt 5940 
ggctccgccc actgcattct gttttggtaa attctgttct gagctgcatt ctgttttgtt 6000 
ggggcttcca gtcttttttg tgcattttta atggaatttc ttcgttttta gcgaaatttc 6060 
aggttaaaat tatttaaaac ccgttttttt ttcaattgga aatgcgagga aaaaccacaa 6120 
aatcacagag atagcgaggc cccacgaaaa ggggagcaga acaaaaaagg gggggggggg 6180 
gctggcactg tgccaaacgc acaaaacgct ttttattctt attcaacgca cgactttgtt 6240 
ataaccacac tccgttatta cgcatcgcgc gctgtttagc gtgaaaatac aaaaaaacgt 6300 
cgtgcgttga atgagaataa aaaagcgttt tgtgcgtttg gcacagtgcc agctctcctt 6360 
ttcgcagatc cccttttcgt ggggcctcag agaaagctgc cataaacttt tttcttcgcg 6420 
ctaagaccaa taccaataaa tccttgcgcc tttaatatgc aaactatatt tttcttccag 6480 
aaccttccgt gctcgaaaca gttcgcttgg taccgaagaa gaaaccgatg atctaagccc 6540 
gaaatctctg tatttcgctc gcagtaatcg gttcgcattc aacgatgatg aaactgaacg 6600 
ggaatggact tcaagatgcc aacaatcatc gtggagagat acagaggtgg atgatgagct 6660 
gaaaaagcgg gaaacaacgt ctgaaagtga gattttgaac gatttacctg ggaaaataga 6720 
ttattttggg cctattttaa ttatttaatt gcagaattta ccgaaaccac gacgaatgga 6780 
agtaccaaaa cacacacaga atcggatgat agtgaagttg aacggatgga ggttgatgat 6840 
caagttgatg aagctcaaat aactgtatca tcatcaaaag acgatggaat gaatggaaat 6900 
gataagaacg aggatgaaga agatgatgat gatgatatgg atgtagatga acatcagact 6960 
gtcgtgggtg tgcatcagca ccagcagcag cagcatcacc agcaaaaagt tcggcatcaa 7020 
atgaatggtg gtggtggtgg tggtggagtg gtaaaactga aaccgccgct gcaagaactt 7080 
tcgccgccgc tttcgggaaa cggaagagcg gacagagcgg aaccgacgcc ggttccggca 7140 
aaggtagtga ggcttttttt ttaaatactc gaaaaagaag gaaaaaatcc cacttttaaa 7200 
aatacgattc ttaaaaatgc gaattccctc caaaatgaga actctgattg gccagggagc 7260 
tctcattttg aatggaaatt agtcaaaatt gaaaaatccc gttttttttt taagttggat 7320 
ttttcatttt ctcgcgattt tttccgcgtt tctgtgtcat tcctgaattt aacatttaat 7380 
aaattaaaaa tgtctggaat attgacaaat tatgcttcaa attttttgcg cgggagttca 7440 
aaaataattt ggcccttttt attttttatt ttgcaaaaat atataaaaaa tcattttaaa 7500 
aaatttagaa acatttttta atttttttaa cagttatatt cgctatattg ggacggtatt 7560 
ctgtcattaa acttggtgtt gtcgaatttt ttttattgct ttataagact caaaattgtc 7620 
tgaaaacacc gaattttata atgaaacttc ttggaaactt ctcaaaaaaa agttatgacg 7680 
gctcaaaaaa tgacctaaaa tttgttaaaa tttgaaattt gacttgtcgc aacggctgga 7740 
aacaattttt ttttttgaaa tcaccgtcaa attttgagta taaaatttaa ttattttgcg 7800 
ttttcaactc gatttttggt attttcaagt cgatggacgg caagatttgg ttaaaaaatt 7860 
aaaagccgtc cattttctcg ccgtccattg actttaaact acctaaatcg agttgaaaac 7920 
gcaagataat tgacatttat acccaaaatt tgactgtggt tttaaaaaag ttagtttcca 7980 
gccgctgcga caagtcaaat ttccaatttt aactatttta ggccattttt tgagccatca 8040 
taactttttt ttgagaagtt tttaagaagt ttcatcatga aattcggtgt tttcagacaa 8100 
ttttgagtct aataaagtaa ttttaaaaaa ttcgacagac accaccttta tagcaatttt 8160 
gaattttttt ttaaacttgt cttgaaaaat cttgaaaaaa gtcgaataaa ttcccatttt 8220 
cctattttct tttttgcaga tgtgcggaac ggtgtcggac tcagatgatt ggagagagcc 8280 
gagtggatca ccatcagaat cgaattcatc aaccgaatgg ggtggctata cgccacaaga 8340 
acagcatgca gttgttgttg ccaacgcggt agctgtcgct ttcaaggaaa aattgatgaa 8400 
tggcgtggat gatgatgatg atcaacaacc atcgccggct agaggagcac gagatcattc 8460 
catcaaagag ttcgttagtt tttctttgct tttttttttt ttgatttttg agagcaaatt 8520 
tgaaaagttt tacacggttt ttgaaaaact gttgaaatta aaatttgttg agaatttgat 8580 
ttcgagcaag ttttattttt aaaaaattga atttttcaga aaattctgag ttttcttttt 8640 
aaaaaattga aattttcaga aaattctgag tagcaagaat ctttaagatc cttaatttct 8700 
atgcaagaat acgtaggagt tttactttgc tcaggaaatt ttattttttg tcagaggagt 8760 
atatccgaaa aagaacaaaa aaaatgcaca tttctcaaaa cgcgtatttt tttttcagtt 8820 
cgatgtcaac ggtaacactg ctggaacgga aaaagttcat gatgccgtcg acaatcggtc 8880 
tataatttga actctctgct gctgcttctg ctactgctgc tactgctgct catcgccaat 8940 
tttcaatcct cctgagattt tttgatggtc attcattgtt ttgtgcatat ctctctctct 9000 
ctctctctct cccatgattc tcaaatattt caatgtattt acacccccac tctgtccgct 9060 
gcctaatccc cgaccgaata atcagattcg ctggaaaaat ctgcgattct ttaatattgc 9120 
aaccacccac ccaataatat gtgtctcatc atctcggtac tctcacttga gccgtgtttt 9180 
ctgtagtatt ttattctcta aaaaaaaatc atttttaata taatatacgt acacatttat 9240 
atctgtaata tatattttta aaaatgattc ccccctcccc tccattcgtt gttttttttc 9300 
tgtgggtttc aagcttttga gctgtgaaaa atctcatccc atcatcattt tctattgttt 9360 
tttttcacag ttgaaatatc ctattttatc tttttccttt ttttttcatt tttttttttt 9420 
catcgtgcgg gattcatttt tcgtcccgcg aaacgcccgc cgccgcccaa tcccactctc 9480 
tctctcagtc tcttcttaat gatcttcgaa actattttta tttccctcat taacaattac 9540 
gaggtcgtct ttttttttcc ccacccccca ctgtttggtg taatttttgt gttcggggag 9600 
gttttttgtg tgtggatttt tggatttttt ggattttttc aacaaaaaat tcccccgaaa 9660 
tcaaaatttt ttcccatttt cccctcaata ttagtactgt tgtataaata aacttgctct 9720 
ctctctctct ctcgaaatct cctactatta tttttttaaa agatttttcc aacaaaaatt 9780 
caaaaaacca cacaaacgac ctctctgcac gcggtaatcc tctctctttt tgtcccccat 9840 
tttctctgtt tctctttttt tctatcccct atacctgtga ttggaatatc 9890 



<210> 19 
<211> 2388 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 19 

atggccacta cttcgaaggc gtttcgagcc cgggcgctcg actcgaaccg gtctatgact 60 
gtatactggg gccacgaact tccggaccta tcagaatgca gtgttggaaa ccgggcggtg 120 
acacaaatgc cgtctggcat ggaaaaagaa gaagaacagg aaaaacacct gcaagaagcg 180 
attgctgccc agcaagccag tacatcgggt attcagctga accatgtcat tccaactcca 240 
aaagtcgacc gagtcgaaga tcaacgctat cactccactt atcacaacaa gaataaaatg 300 
caccgttcaa agtatatcaa agttcatgcc tggcaagcac tcgaacgaga cgaacccgag 360 
tatgactacg acacagaaga tgaagcatgg ctatcagatc acactcacat tgacccgcgc 420 
gttttggaaa agatattcga cacagtggag agccattcat cggagacaca gatcgcgagc 480 
gaagattcgg tgattaattt gcataaatca ctggactcat caatcgtgta cgaaatatac 540 
gaatattggc tgtcgaagcg aacatcggct gcgacgacgt ctggttgtgt tggagtcggt 600 
ggattaattc cgagagtcag gacagaatgt cggaaggatg gacaaggtgt tatcaatccg 660 
tacgttgcat tccgtcgacg tgccgagaaa atgcagactc gaaagaatcg gaaaaacgat 720 
gaagattcgt atgagaagat tctcaagttg gtacatgaca tgtcgaaagc tcaacagctc 780 
ttcgatatga ctgcccgacg agaaaagcag aagctcgcgt tgattgatat ggaatcggag 840 
attttagcga aacgaatgga gatgtcagat tttggtggtt ctccgagttc gttcaatgag 900 
atcaccgaaa agattcgagc agcagcaacg ttggaagtcg tgaaaccacc actggcagaa 960 
atcaacggat cagatgaagt gaagaagagg aagaagccga gacgaaagat tgctgataag 1020 
gatttaatat cgaaagcctg gcttaaaaag aatgcagaaa gttggaatcg gccgccgtcg 1080 
ctctttggac aacacagtgg aaatgttccg acggttacaa cgaagccagt tcgagagtcg 1140 
ttggcgaatg ggcgatttgc gttcaagcgg aggagaggat gtgtttatcg cgcggctctc 1200 
accgtttaca atgtgcctac agcgcctgct acagtacctc cagtacagac tcaagcagca 1260 
gtggcttcat catcatcgtc aaaatcaacg gatatggtgc cgtcgaacat gaagttcttt 1320 
gaaacttttg ttcgggattc acaggattca gtttctcgat ctcttggctt tgtacgccga 1380 
cgaatgggac gaggtgggcg agttgtattc gatcggatgc ctcgcaatcg agacgacaac 1440 
gacgaacgca cttcgacaga tccatgggcc gagtattgtg tcgcggatag ttcaagaacc 1500 
ttccgtgctc gaaacagttc gcttggtacc gaagaagaaa ccgatgatct aagcccgaaa 1560 
tctctgtatt tcgctcgcag taatcggttc gcattcaacg atgatgaaac tgaacgggaa 1620 
tggacttcaa gatgccaaca atcatcgtgg agagatacag aggtggatga tgagctgaaa 1680 
aagcgggaaa caacgtctga aaaatttacc gaaaccacga cgaatggaag taccaaaaca 1740 
cacacagaat cggatgatag tgaagttgaa cggatggagg ttgatgatca agttgatgaa 1800 
gctcaaataa ctgtatcatc atcaaaagac gatggaatga atggaaatga taagaacgag 1860 
gatgaagaag atgatgatga tgatatggat gtagatgaac atcagactgt cgtgggtgtg 1920 
catcagcacc agcagcagca gcatcaccag caaaaagttc ggcatcaaat gaatggtggt 1980 
ggtggtggtg gtggagtggt aaaactgaaa ccgccgctgc aagaactttc gccgccgctt 2040 
tcgggaaacg gaagagcgga cagagcggaa ccgacgccgg ttccggcaaa gatgtgcgga 2100 
acggtgtcgg actcagatga ttggagagag ccgagtggat caccatcaga atcgaattca 2160 
tcaaccgaat ggggtggcta tacgccacaa gaacagcatg cagttgttgt tgccaacgcg 2220 
gtagctgtcg ctttcaagga aaaattgatg aatggcgtgg atgatgatga tgatcaacaa 2280 
ccatcgccgg ctagaggagc acgagatcat tccatcaaag attcgatgtc aacggtaaca 2340 
ctgctggaac ggaaaaagtt catgatgccg tcgacaatcg gtctataa 2388 

<210> 20 
<211> 795 
<212> PRT 

<213> Caenorhabditis elegans 
Asp 
485 490 495 

Ser Ser Arg Thr Phe Arg Ala Arg Asn Ser Ser Leu Gly Thr Glu Glu 
500 505 510 

Glu Thr Asp Asp Leu Ser Pro Lys Ser Leu Tyr Phe Ala Arg Ser Asn 
515 520 525 

Arg Phe Ala Phe Asn Asp Asp Glu Thr Glu Arg Glu Trp Thr Ser Arg 
530 535 540 

Cys Gln Gln Ser Ser Trp Arg Asp Thr Glu Val Asp Asp Glu Leu Lys 
545 550 555 560 

Lys Arg Glu Thr Thr Ser Glu Lys Phe Thr Glu Thr Thr Thr Asn Gly 
565 570 575 

Ser Thr Lys Thr His Thr Glu Ser Asp Asp Ser Glu Val Glu Arg Met 
580 585 590 

Glu Val Asp Asp Gln Val Asp Glu Ala Gln Ile Thr Val Ser Ser Ser 
595 600 605 

Lys Asp Asp Gly Met Asn Gly Asn Asp Lys Asn Glu Asp Glu Glu Asp 
610 615 620 

Asp Asp Asp Asp Met Asp Val Asp Glu His Gln Thr Val Val Gly Val 
625 630 635 640 

His Gln His Gln Gln Gln Gln His His Gln Gln Lys Val Arg His Gln 
645 650 655 

Met Asn Gly Gly Gly Gly Gly Gly Gly Val Val Lys Leu Lys Pro Pro 
660 665 670 

Leu Gln Glu Leu Ser Pro Pro Leu Ser Gly Asn Gly Arg Ala Asp Arg 
675 680 685 

Ala Glu Pro Thr Pro Val Pro Ala Lys Met Cys Gly Thr Val Ser Asp 
690 695 700 

Ser Asp Asp Trp Arg Glu Pro Ser Gly Ser Pro Ser Glu Ser Asn Ser 
705 710 715 720 

Ser Thr Glu Trp Gly Gly Tyr Thr Pro Gln Glu Gln His Ala Val Val 
725 730 735 

Val Ala Asn Ala Val Ala Val Ala Phe Lys Glu Lys Leu Met Asn Gly 
740 745 750 

Val Asp Asp Asp Asp Asp Gln Gln Pro Ser Pro Ala Arg Gly Ala Arg 
755 760 765 

Asp His Ser Ile Lys Asp Ser Met Ser Thr Val Thr Leu Leu Glu Arg 
770 775 780 

Lys Lys Phe Met Met Pro Ser Thr Ile Gly Leu 
785 790 795 

<210> 21 
<211> 37007 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 21 

cagctgatgt tgttgatgga aaaatgacgg ctgcaaagaa gccattggct gcaactgagc 60 
caaaagtgca taataaataa atgtgtttct aggatcttct aataattttt tttctgtttt 120 
ctagctctaa acttgtattt atttcattct tgttctacca aattcccacg gattctacgc 180 
tttatgtttc taaattatta ttctttttta tttatatctg cattttcttc taaaaactct 240 
ggtcattttc ttgttttttt cttggtaatt ataaaaatta gtcatacaaa tcttgttaaa 300 
tatctggcta ttcagtgaac aaaccatttt ccgctctaaa ttcgacccga atcaatcgaa 360 
aaatggctca aaacgatgcc atctggctgc aacccccctg tcgtctctca attttgtgta 420 
ctctctcgca gccacgcacg cgacgcaacg cactcgcgtc gcggtcgcag ttctttttca 480 
aatttatcgc gccatttttg ttttgcctca tatttatcgg ctcacgattg attttcgtcg 540 
aaaaacgcgc ttaatcgatt cctttttacc tgaaaaatgt tgttccaatt ggaaaaccag 600 
ttgaagatcg atgaattttc aagaaaatca ttcaaatagg caaaacccgc tgaactttga 660 
aattcgattt ttgagttttt tgaagaaaat ataattattt catcatttat gttggtcctg 720 
ttggtcctca gcatagaaaa ttcggacatg acattagaaa ttcataataa ctgctcccaa 780 
tatcgggatt agaacgattt tcagctcaaa atatggaaaa ttggttacat aaaccgcata 840 
tttgtagcat taatcttgaa cagctatatg gcattaaaaa aaaatatata tatacattgt 900 
tttttctctc gaagtttctc tttttgtttc taaaatccgg aatataattt aaaaaaccac 960 
ataaatttca atttgcagta cgagttcccc ccgaatcaca atgccggcaa caccggtgcg 1020 
tgcttcaagt actcgaataa gcagacgtac atcatcaaga tcagtggctg atgatcagcc 1080 
atcaacttcg tctgcggtgg ctccacctcc ttcacccatt gccatagaaa ctgatgaaga 1140 
tgcggtagtt gaggaggaga aaaagaagaa aaagacatca gatgatttgg aaattatcac 1200 
tccaagaact ccagtcgatc ggcgaattcc ctacatttgc tcgattcttt tgactgaaaa 1260 
tcgatcgatt cgcgataaat tgtacgattt tttaaattta attactttcc tcaaatccga 1320 
ataattatta gatcgcgctt cgcgtttctg catccgcggt attttgcctt cccactgaaa 1380 
atagcagatt tatcgaattt ttagcttaaa aaaaaaatgt tttttctgca tttttcaaac 1440 
aaaccttttg taaaacagtg aaaatcgaat ttcaaatgac taaaatgaat tttttttttg 1500 
tccactggtt gtggaatggt ttgaatttga agaaatcagc gggatttttc gtattttctg 1560 
aatatttttc tattaaaaat cggtttcaaa ccattttttg acttttgaat agaaaaatat 1620 
tgagaaaata cgaaaaatcc agctaacttc cagcttgttc aaattcaaac cattccacaa 1680 
ccagtggacg aaaaaagttc attttagtca tttgaaattc gatttggttt gtttgaaaaa 1740 
tgcaaaaaaa aaatattttt taaagctaaa aatttgataa atctgaaaaa aatctgctat 1800 
tttcagtgga aaggcaaaat accgcgaagc gcagcaagcg cgctctaata attattccgc 1860 
ttcgagaaga gcgtgtatta tttcattgtt acatttcaaa attatgaatt aatgtttttc 1920 
agggttctga gcagcggtcc agttcgtcaa gaagatcacg aagaacagat tgctcgagct 1980 
caacggatac agccagttgt cgatcaaatt caacgagtcg agcaaatgta tgtgaagctg 2040 
aaaaattgca ccacaaatca attattctaa tcttgtttta cagcatactc aatggttcag 2100 
tggaagatat tctgaaagat cctcgattcg cagtaatggc agatctcaca aaagaaccac 2160 
caccaacacc tgcacctcct cctccaatcc agaagacaat gcaaccgatt gaggtgaaaa 2220 
ttgaggattc agagggctca aatacggctc aaccgagtgt tctgcccagt tgtggaggag 2280 
gagagacgaa tgtggaaaga gccgccaaaa gagtgagttt tgaagataga ttggtgtgta 2340 
aaaaatgaat gtttatatat tcactgcaac tttttcctca cgagggacga ggaaaagtgg 2400 
tttctaggcc atggccgagg tgccgacaag tttcagcggc catttatctt gctttgtttt 2460 
ccgcctgttt tctttcgttt ttcatcgatt tttttcgttt tttcttaata aaactgataa 2520 
ataaatattt tttgcagatg ctaaaacaat ttccaagtaa aaaaattatg tattcagtgg 2580 
gcaagcagcg gtgaaagtgg tcaatgcaat atgatggatt acgggaatac aaaacctaaa 2640 
ctttttctga aacatgatac atacgctgct taaatgctga gactacctga ttttcataac 2700 
gagaccgctg aaaaagtttt gaggttttca aaattcaaat tttttggtga aaaagtcgag 2760 
attttcgcac aaaaagttga attctgaaaa cctcaaattt ttttcagcgg tctcgttatg 2820 
aaaatcaggt aatttcagca tcatatgtat catgtttcaa aaaaagttta ggttttgtat 2880 
tcccgtaatc catcatattg cattgaccac tttcaccgct gcttgcccac tgaatacatg 2940 
attttttact tggaaattgt tttagcatct gcaaaaaata tttatttatc agttttatta 3000 
agaaaaaacg aaaaaaatcg gtgaaaaacg aaagaaaaca ggcggaaaac aaagcaagat 3060 
aaatggccgc tgaaacttgt cggcccctcg gccatggcct agaaaccact tttcctcgtc 3120 
cctcgtgagg aaaaagttgc agtgttattg taaatctcac aagagtctgg catgatttct 3180 
caaaggcgca tggatttatt cagccctaaa attaaataaa tccatacgac tttaaaggtg 3240 
gagttcggaa aatgaggatt ttactttaaa atgctcaaac tagtcccaaa tgccgaatta 3300 
ccacaaaaga aaaacggaaa aaaattcatc aagtttgaaa aaaatgcgga tgattttgtt 3360 
gaaatttcaa cgctcgctaa tattcctaat ttgaaccgcg cttttgtccg cgccgcactc 3420 
tgtagaattg catccgcgct gtttccttcc tcttccggcg ccctacttct tttcgattgg 3480 
aaatgatgaa aaaatgagac aaaactagaa ttcacgtagc gcgtcggaaa tgatgaaaat 3540 
atcatggatg cagcagatct acggagtgcg gcgcggacaa acggcgcggt aattcaaatg 3600 
aggaatatta gcgagagttg aaatttcaac aaaatcagcc gcattttttt caaacttaat 3660 
gtattttttt tcgtttttct tttgtagtaa ttcggcattt ggggctagtg taagcatttt 3720 
aaagtaaaat cctcattttc cgaactccac ctttaaaggt ggagtaccga aatttgagac 3780 
tttgcttttt taggcccaaa ttggtccaaa actaccgaat tttgtaatga gacgttctga 3840 
aaatttatcc aaaaaatgtt atggcggttc aaagttcggc aaaatagggc ccattttcag 3900 
ctaaaatcaa attttttttt ccaacttttt 
atttattaat cactttttaa taaatattgt 
taagtacatt tatggttttt ggggcacaaa 
taaatgtact taaatcagcg aataaacgcc 
tgatgaataa ataaaaatta ggttccagac 
ttgattttag ctgaaaatgt gccttatttt 
ttgagaaaga aattttcaga acgtctcatt 
gtctaaaaag tttcaaattc caataaaaca 
attcctaaac gtattataat ccattctcaa 
atcgccgagc tccgtaagaa cggcttatgg 
cctgaacgta ataaaacgca ttgggattat 
gatttccgaa ccgagacgaa tacgaagcga 
gcgaaacagc accgcgacaa gcagatcgag 
gagaagcgaa aaatgtgtgc aggaatcgcg 
gataaagttg tggatattcg agcgaaggaa 
aataagcatt tgatgtttgt aattggacaa 
ggacttgttt catcgtcgaa atccccatca 
gaattcaaag cacctggctc tgattcagaa 
gaaaagtcac agaaaaagga agatgttcga 
actgtggata tggatgactt tttgtacact 
ctgacgcagg aggatttgga ggagatgaag 
aaggaagctt gtggtgataa tgaggagaaa 
ctaaaaaaat tacctaaaaa aaatcgattt 
gaaacgtcat ggcggctcga aattttgaaa 
ctcaaatttt attgcatatt ttggtagttc 
aaaatgtacc tgaatctgca agtaaacgac 
ttccgatact gaaatgtggg cgaaatttga 
ccaaaacccg gctttaccgc acgaaggttt 
tcaaatttcg tccacttttc agtcagaaat 
ttgcatattt tcgtcgttta ttcgttgatc 
attcggtaat tgtgcatcca tcggctgaaa 
aatttaagat tttagattga aataagccgt 
ttaaattcaa tttaaatccc ctctttattt 
ttccacctca agctcagatc tcaccgccga 
caacggtgat ggtcatggtg tacttgaaaa 
tagtgatgaa cgacaacaag agttggcgaa 
aaaaggatat acacttgaga cgacacaagt 
acaactgaga gaatatcaaa tggttggatt 
tttgaatgga attcttgccg acgagatggg 
gctggctcat atggcttgta gtgaatcgat 
gtctgtcatt ctgaattggg agatggagtt 
gacgtatttt ggtacggcga aggagcgtgc 
ttgtttccat gtgtgcatca catcatacaa 
gcagagggtg cgtagaaatt ttgaagattt 
tttaaaacca attttaccga taattgcgaa 
tgctataatt agtataattt ttgcaaaaat 
taaaacattt ttgaacaatt tttaagaggt 
ttttggcgat atgaatcgcc cgaaaatgtc 
ttaaaaaaaa atggcccaaa attgtctcaa 
gaaatctcaa aatttgccaa attttccgtc 
tttttgtggt ttattttagc gttatttcgt 
caaaaattat actaattata gcaatttctg 
acttggtata aatggttttt ttccaaattt 
atttgaggct ttgttttttt ttttggaccc 
tgagacgctc tgaaaatttc tttctcaaaa 
aaaataaggc ccattttcag ctaaaatcaa 
gcctggaacc taatttttat ttattcatca 
ggcttttatt cgttgattta agtacattta 



cggtgtcgca acgtctggag cctaattttt 3960 
agcctttgat taggcgttta ttcgctgatt 4020 
taaaagtttc attttatgcc ccaaaaacca 4080 
taatcaaagg ctacaatatt tattaaagag 4140 
gttgcgacac cgaaaaagtt ggaaaaaatt 4200 
gccgcgaact ttgaaccgcc ataacttttt 4260 
acgaaattcg gtagttttaa accaatttgg 4320 
taccaaagtc ttgtgaaatt acaataaact 4380 
ttcttgcagg aagcgcatgt attggctcga 4440 
tcgaacagtc gtctgccaaa gtgcgtcgaa 4500 
ctactggaag aggtcaaatg gatggcagtt 4560 
aaaatcgcca aagttatagc tcacgccatt 4620 
attgagagag ccgccgaacg ggagatcaag 4680 
aagatggtac gggatttctg gtcgtctacg 4740 
gttctggagt cgaggctcag gaaggcgaga 4800 
gtcgatgaaa tgagcaatat tgtgcaagaa 4860 
attgcatcgg atcgagatga taaagatgaa 4920 
tctgacgatg agcagacaat tgcaaacgcg 4980 
caggaagttg atgctcttca aaacgaggca 5040 
ttaccgccgg aatatctgaa ggcttatggt 5100 
cgcgagaaat tggaggagca gaaggctcgg 5160 
atggagattg atgaagttcg taggatgctc 5220 
tccctggaaa aaatcctctg gaaatgaccc 5280 
aaaaaaaccc cccaaatttc cagctaaaat 5340 
ttttgttgtc cgaggtgcgt ttttcagctg 5400 
caatatatgc aataaatgat gataattaat 5460 
gatttcgact gaaaacgtct taaaaatcac 5520 
gaagaaaatg gccaattttt agccaaaatc 5580 
tagttttttg aaattaatta acacctttta 5640 
gaggtgcttt ttcggtcgat gggtgcacaa 5700 
atgctccaga atttgcgaat gaacggtgaa 5760 
tttttagaga aaattggtcg ttttgagaca 5820 
tcagagccca tcatcagatg ctcaaaagcc 5880 
gcagcttcaa gatccaacag ctgaagacgg 5940 
cgtggattac gtgaagctca acagtcagga 6000 
tatcgcagaa gaagcgctga aattccagcc 6060 
caagacgccc gtaccattcc tgattcgagg 6120 
ggattggatg gttacacttt atgagaagaa 6180 
cctgggaaag acgattcaaa cgatttccct 6240 
ttggggacca cacttgattg ttgtgccgac 6300 
caagaaatgg tgtccggctc tgaagatttt 6360 
cgagaagcgg aagggatgga tgaagccgaa 6420 
gacggttact caagatatta gagcttttaa 6480 
gcggcgaatt tggcgaattt gcataatttt 6540 
atttttcaat tttatacagt ggtcggaaat 6600 
tggtactttt ttcgaaattt tgaaccacca 6660 
ttaataacga aattcgttca tttgaacaca 6720 
ccccaataga cctaatttct taacaaaaat 6780 
aatttcgaaa aaaaaaccgt aatttcagct 6840 
tcacggagat cagaaaaagt tttttgcatt 6900 
taatttagat acattttagc ccaatttttg 6960 
acccctgaca aactttgaaa ttatcggtaa 7020 
ttaaagcgat attaaaggtg gagtaccaca 7080 
aaattggtcc aaaactaccg aatttcgtaa 7140 
aaaaagttac ggcggttcaa agttcgcggc 7200 
aattttttcc caacttctcg gtgtctcaac 7260 
ctttttaata aatattgtgg tctttgattg 7320 
tggtcagtgg ggcacaaaat gtaacttttt 7380 
ttcccaaaga ccataaatgt actttaatca acgaataaac gcccaatcaa agaccacaat 7440
atttatttaa aagtaatgaa taaataataa ttaggttcca gacgttgcga caccgagaag 7500
ttggaaaatt tttttatttt agctgaataa gggccttatt gtctcaaact ttgaaccgcc 7560
ataacttttt tttgagaacg tctcgttacg aaattcggta gttttggacc aatttgggtc 7620
taaaaaaaca aagtctcaaa tttcttgtta gagatttttt aaaaattgat attttttttt 7680
tcaggcctgg cagtacctaa ttctcgatga agctcaaaat atcaaaaact ggaagtccca 7740
acgttggcag gctcttctga atgtccgtgc tcgacgtcgc cttctcctga ccggaactcc 7800
acttcagaac tctctaatgg aactgtggtc gttgatgcat tttttgatgc caacaatatt 7860
ctcaagtcat gatgatttca aggattggtt ctcgaatccg ttgacaggga tgatggaagg 7920
aaatatggaa ttcaatgctc cactaatcgg acgacttcac aaagtgctcc gtccgtttat 7980
tctgcggcgg ctcaagaagg aagttgagaa gcagctgcca gagaagactg agcatattgt 8040
gaattgttcg ttgtcaaagc ggcagagata cctgtacgat gactttatga gtcgtagatc 8100
aacaaaggag aatctaaagt ctggaaatat gatgtcggtg ctcaacattg tgatgcaact 8160
ccgaaaatgt tgtaatcatc cgaatctctt cgagccgcgg ccagttgttg ctccgttcgt 8220
cgttgagaag cttcagctcg atgttccggc tcgtctcttt gaaatttcgc agcaagatcc 8280
ctcctcctcc tcagctagtc aaattccgga aattttcaat ttatccaaaa tcggctatca 8340
atcttccgtt cgatctgcaa aaccactcat cgaagagctt gaagcaatga gcacttatcc 8400
ggagccacga gcaccagaag ttggcggatt tcggttcaat cggacggctt ttgttgcaaa 8460
gaatccgcat acggaagagt cggaggacga aggtgttatg agaagtcgtg ttctggtgaa 8520
tttttaggaa aattgagaaa atgatctaat tgttgaattt tttaaagaat ttatgggcca 8580
caagccgatt tgccggaaat tttgattttt ggcgatttgc cgaaaatttt gatttttggc 8640
gatttgccag aaattttgat ttttggcaat tatccgattt gccggaaatt ttgatttttg 8700
gcgatttgcc agaaattttg atttttggca attatccgat ttgccggaaa ttttgaattt 8760
tggcaatttt ccgatttgcc ggaaattttg atttttggca atttgccgaa ttgccggaaa 8820
ttttgatttt tggcaatttg ccgaattgcc ggaaattttg atttttgggg atttgccgga 8880
aattttgatt tttggcaatt tgcctatttg tcggaaattt tgatttttgg caatttgccg 8940
atttgtcgga aattttgatt tttggcaatt tgccgatttg ccggaaattt tgatttttgg 9000
caattttccg atttgccaaa aattttgatt tttggcgatt tgccgatttg ccggaaaaac 9060
attttgtgag ccaattttct cgaaatttgg gcttcaatat tttcaaatta ttccaaattt 9120
tccactgatt ccgaatatct aagtaaaaaa aaattccctg attttatatt tcagcttaaa 9180
atcgctaatt ttcgcgtcag agacgacgtc atgtgtcgat ttactggatt tttaatcttt 9240
gtcggatgct aatttccgtt tttcaacgag tttccttcat ttccatcggt ttttgacgaa 9300
gttttctttg aaaatatgtt cttaaggtca attaaacgtt ttattatcaa aaaaaactag 9360
caaaattggc tttaaaaaca cattttcaca gaaaactccg acaaaaaccg acgaaaatga 9420
aggaaacccc ccgtttgaaa acagaaatta gcatctgata aagattaaaa tcccgtaaat 9480
cgacacatgg cgtctggcgt ctctggcacg aaaagtcgcg attttaagct gacatacaaa 9540
aaaagaggga tatatttttt tacgaatttt tcacatagat attcgaaatc aggggggaaa 9600
atttggagaa atttgagaaa atttctcaga tttcggatta aaaatattca atttttgttt 9660
tcttatatta aaaaaaaatt aacttttata atttttcagc caaaaccaat taatggaaca 9720
gctcaaccac ttcaaaatgg aaattcaata ccacaaaatg ctccaaatcg tccacaaact 9780
tcatgcattc gttcaaaaac cgtcgtaaat acagttccac tgaccatctc caccgatcga 9840
agtggttttc attttaatat ggccaatgtt ggaagaggtg ttgttcgttt ggatgattca 9900
gcacgtatga gcccaccgct caaacgtcag aagctcaccg gaactgcaac gaattggagt 9960
gattatgttc cgcgacacgt tgttgaaaag atggaagaat cgagaaaaaa ccagctggaa 10020
attgttcgaa ggcgatttga gatgattcgt gctccgatta ttccactgga aatggttgcg 10080
ctggttcgag aggaaattat tgcagaattt ccacgtttgg ctgtggaaga ggacgaggtt 10140
gtgcaggaga ggcttttgga gtattgcgag ttgttggtgc aaaggtagaa ttttgaaaat 10200
tattactttg ctttttttta aaccaaaatt ggcccaaaac taccgaattt cgtaatgaga 10260
cattctgaaa gcttctcaaa aaaaaagttt tggccgctca aagttcggga aaataaggcc 10320
cattttcagc tgaaatcaaa attttttcca acttctcggt gtcgcaacgt ctggaactaa 10380
aattttggaa aacgagaaat tttccatttt ttgcaagctg aaaaatcaaa gttttttttt 10440
cctcaaaatt ggacaaacaa aaaaattttt ttttgaaaat tgatcgaaaa aattcaaaat 10500
ttctataatt tttcgatttt ttaaataaaa ctttcatcat ttttcttcca aatttagttt 10560
tctcgatttt aacttttttc aaaaaaaaat tttttaatac gaaaaaaatt caattttagc 10620
tctaattctt ttttagaccc aaattggtcc aaaactaccg aatttcgtaa tgagacgttc 10680
tgaacatttc tcaaaaaaaa gttatgacgg ttcaaagttc ggcaaaataa ggcccatttt 10740
catataaaat caaatttttt ttctaacttc tcggtgtcac aacgtctgga acttaatttt 10800
tatttaatta ttacttttca ataaatattg tggtctttta ttaggcgttt atttgttgat 10860
ttaagtacat ttatggtcaa gtggggccca aataaaagtt acattttgtg cccacatgac 10920
cataaatgta cttaaatcaa cgaataaacg cctaatcaaa ggccacaata tttattaaaa 10980
agtgttgaat aaataaaaat taggttccag acattgtgac accgagaagt taaaaaaaat 11040
tttgatttta gctgaaaatg ggccttattt tgctgaactt taaaccgcta taactttttt 11100
ttgagaaatt ttcagaacgt ctcattacga aattcggtag ttttggacca atttgggtct 11160
aaaaaagaat tagagctaaa attgaatttt cttcgtatta aaaatttttt ttttgaaaaa 11220
agtaaaaatc gagaaaacta aatttggaag aaaaatgatg aaaattttat ttaaaaaatc 11280
gaaaaattat agaaattttg atcgattttt tcgatcaatt ttcaataaaa aattttttgt 11340
ttgtccaatt ttgaggaaaa aaaaaacttt gatttttcag cttacaaaaa atggaaagtt 11400
tctcgttttc caattttttg atgtggattt ttatgagaaa aaatatataa tgtcacaaaa 11460
aatagattat tatctaaaaa tcgaaaaaat taaattttcc agttttcagg aaaaaaatcg 11520
ttaagaaatt gtttttccat taaaggtgga gtaccgaatt ttgagacgct gcttttttag 11580
acccaaaatg gtccaaaact accgaatttc gtaatgatac gctctgaaaa attttcaaaa 11640
aaaaagttgt gaccgctcaa agttttggaa aaatggcata tttttagcta aaatctcaaa 11700
ttttggcaac ttatcggtgt cgcagcggtt ggaacttaat ttttatttaa ttgtcattca 11760
ttaatgcatg ttttggcatt tcattatgtg ttatttcgtt gattgagatg ctttttgtgc 11820
ctgcatcgac caaaaaacca tctcaatcaa cgaaataaca cataataaaa tgccaaaata 11880
tgcattaaag gatgataatc aaataaaaat taagtttcaa ccgctgcgac accgctaagt 11940
tgccaaaatt tgagatttta gctaaaaatg gtccattttt ctaaaacttt gagcggtcac 12000
aacttttttt ttgagaaatt ttcagagcgt ctcattacga aaattggtag gttcggacca 12060
atttgggtct aaaaaagcag cgtctcaaaa ttcggtactt cacctttaaa gttttcaatt 12120
taaagtataa attatccaat caaaaattga cgaaaaaatt ttttaaaaat tttttcttcc 12180
gaaaaaaaaa ttaattttaa tttttgttag attcggaatg tacgtcgaac cagtgctgac 12240
cgatgcttgg cagtgtcgtc catcatcgtc tggtcttcca tcatatattc gcaacaattt 12300
atcaaatatc gagctgaatt ctcgttctct tctcctcaac acctccacta atttcgatac 12360
ccgaatgtcg atctcacgtg ctcttcaatt cccagaactc cgtctgatcg agtacgattg 12420
tggaaagctt cagacgttgg ctgttctgct tcgtcagttg tacctgtaca agcacagatg 12480
tctgatcttc acgcaaatgt caaagatgct cgacgttctg cagaccttcc tttctcatca 12540
cggttatcag tatttccgcc tcgacggtac cactggtgtc gaacaaagac aggcgatgat 12600
ggagcggttc aacgcggatc ccaaggtgtt ttgcttcatt ctgtcgacga gatccggtgg 12660
tgttggagtc aatctaaccg gtgctgacac tgtgatcttc tacgattcgg attggaatcc 12720
gacgatggat gctcaggctc aggatagatg tcatcgtatc ggacagacga ggaatgtctc 12780
gatttatcga ttgatttccg agcgaacaat tgaggagaat attctgagaa aggcaacaca 12840
gaagcggcga cttggagagt tggcaattga cgaggctggc ttcacacccg agttcttcaa 12900
acaatctgac agtattcggg atctttttga tggagagaat gtggaagtga ctgctgtggc 12960
agatgttgcg acgacgatga gcgagaaaga aatggaggtt gcgatggcaa agtgtgaaga 13020
tgaagctgat gtgaatgcgg cgaagattgc ggtggccgag gcgaacgttg ataatgcgga 13080
gtttgatgag aaatcattgc cgccgatgag caatttgcaa ggagatgagg aggctgatga 13140
gaagtatatg gagttgatac aacaggtaaa attcggcgga aatcggaaat tttcccattt 13200
agaatatcaa attttgcccg attgtgtcgt tttttgattt ttcgatttat tcgatttgtt 13260
tttgagggaa aatcggaaaa atgttcagaa aattaaccat aacatgtgat ctttttaaaa 13320
tcttagcgca aatgtcttct aaaaaataaa gaatgaccaa aaattttaag ctaatttttg 13380
aaaaaccaaa gaaaaaattt agatttttcg atgttttccg agacaaaaag acaaaaacgg 13440
aaattgtcga aaatgaatga aatttttaat ttttcagcaa aaaaaaaata gtacttaatt 13500
ttaaaaaatg tgatcatttc ggtaggaaaa tctggaaaaa tcgattttca aacaaaaaaa 13560
aaccgagcct ctacaatctt tttttttccc gaaatctcca gaacttctca caataacaac 13620
tatataaatt tcaaaatttc agctcaaacc aatcgaacga tatgccatta actttcttga 13680
gacacagtac aagccagaat ttgaggaaga atgcaaagag gcagaggtat attattccat 13740
tcatctgact tttttttttt ttttttaaat ttaaatttca ccaaattaat tacaggctct 13800
tatcgaccaa aaacgcgaag aatgggacaa aaatctcaac gataccgccg tcattgacct 13860
cgacgattcg gatagtctgc tgctcaacga tccttcgact tctgccgatt tttatcagag 13920
ctcaagtctt ttagacgagg tacgcgatcg tcgtcgtcgc agcagcagcc ttctccaaaa 13980
agccgctcaa aaaccggcaa aaaagcctca aaacttccaa attcgtgctc gctccccgtc 14040
taagcgtaaa tctcaggctc cttccttcga tccatatgtt tcgtacgcac cgcacgcgct 14100
cgcttctccc ccggattccc cgcgtaagag aagatcacgt ggtgcgcgta gtttaggtag 14160
tggtggtggt ggtggtggtg gtagtagatc tgttggaaga cctgcccgcc gatcagtgaa 14220
gaaagaagaa tcagatgatg atgatgagga ttattgccaa gaagaggaag tgaagcgaaa 14280
tccggcagaa aaggtcccgc cgaaaagaaa acgagttgtg tttgtggaac ctccagaggt 14340
gaagccgccg gagccgaaaa aacgagttgt tgttcctgct ccatcatcat catcatcagc 14400
tctaactact cttccacaac aaggaccgct gatttcgttg ccaaaagctg tgccagttgt 14460
acctcggccc caacaacaag caccaccaca gctcatcaaa aagcaccagc agactctgat 14520
gcctgtgaag gtgctcaaga ttagtggtgg tggtggtggt actccaggac catccagtgt 14580
atcgccaggt ccatcaatcc tccgaagaac cgttgttcca ggcataggcg ctggtggtgt 14640
tggacgccta ccgcttgtca gaatgcctgt tcgccctcca tttcctggct cgcaagctcc 14700
tgctccaccg ctgagaagtg gtgttgctcc aacagctcct gcagcagctc cacgccagtt 14760
cgtcgttccg tcgtcgagag ttcgagttat cacgacgaga actccggtcg ccaccaccat 14820
ggtgcaacaa caacaaagcc cgagcccgtt gatgtttcca gtccgggttg tgcaaaggcc 14880
cgggccatct ggaccaccac cacctggacc tccagatcgc ccaggatttg gaatctatga 14940
gaagccgaga ttctcacttg gatcacgaag aagccgtgga gattcgggcc cggaagatcc 15000
ggcgccacca cagccaccac cacccaccac ttctaggcca ccgccacaag cctaggcgct 15060
aggattttcc tttttttttt gttgattttt gctctttttt tgctctctca tgattttata 15120
atctcatttt gctttaatat ttccattttt ttggatgtgt ggaatttttt tttttgaaaa 15180
tcgggaaaaa acgaaaaatt tgaacttttt ggtgattttc agagaaaaat ccgtttttaa 15240
atgaaaaaat cggaataatt cagatttttc gaaaaaaaaa accgagaaaa tttcaaattt 15300
tcagtttttt ttttcaaaaa atcgaaaaaa aaagtaaatt ttcagaatta tcagccaagt 15360
ttttgcgatt ttttgaaaaa tttcaatttt tggcaatttt tgggaaaaaa tcaattttta 15420
attcagaaaa ttggaaaaat taagattttt cgaaaaaaaa aacgaagaaa gtttcaaatt 15480
tttagctttt ttcaaaaaat cgaaaatcgg aattttttta atttttcgaa taaaaaaaat 15540
cgaagaaatt ccaaaacttt gcgttttttc ttgaaattat ctgaaaaccg gaattttttt 15600
tcaaaattcg ccattttttg cgaatttttg taatcttttt ccgagaaaac tcgatttttt 15660
aaatcttaat aattcagatt tttcgatttt cttttgttcc aaaaagtcaa aaaccgaaca 15720
attatttatt tcaaaaactc taaaaatttt caattttttg gaaattttcg ggtataaaaa 15780
aaacccattt ttaaatcaaa aaatcggaaa tttttgtgat ttttcgattt ttttcactcc 15840
aaaaaaattc cacacagcaa aaaataaact ccgcgcattt ttgagcgcac ctttcaatgt 15900
tttaattctt atcacgacgt caaaattcgg ttatttttca cacacacaca ttttcctccc 15960
gagcggttct ttttttcatg agttctccca tgttttgttt ttatatttga gacatttttt 16020
tttgttgata agtttcaact tcttcttctt cttctgacta taaacgtttt tctccatgtt 16080
ttttgcctgt tttctgccga ttttttgaca cccaaaattt tttttcattt tcgctcgaaa 16140
atgcacgtcg ttggctctag ctttggcaag tttttaacac tgattttctg gttttttttt 16200
ttttttgcag aatttttcag agataggggg ctcattccag cagggtttcc cactatattt 16260
cgcatttttt ccaaaaattt ttgtattttc aaaaatttcc aaaaagaaag gggttttctt 16320
taccaaattt ttctcgccac ttttggctta attttggctt tagagattcg atcgaaaaaa 16380
ttgcgaaagt ggcgagaaat ctcactggtt tgatgtttga ccccctacta tagaaaattt 16440
gaaaaaaaaa aaaaaaaaaa aaaactagac gaaatttgtg gaaatcttgc tggagtttga 16500
cgagtcgatg gtggattttt cttgaaacga atgaaacggt gattttggat cggagaaata 16560
tggcgaaaaa tggtgagaaa tgacgaggag gaggaagaag ctgaaaatct ggaggaacaa 16620
aaattgtgtg gaagtctcgg gaagaaatta gaattgaaat tttaaagtgt tctgagaatt 16680
ttttgtgtga aattttttta aatctgtaga tcaaatatca aaaaaaaaaa tcagaactat 16740
tacgtgttta tccacaaaga tgagaaaaat cgccatatct ggcgcgcaaa tgaacccgcg 16800
ggaagagaca aaactactgt agtttttaac caatttgtgt agatttacga gctattgcgt 16860
catcgaattg aatttaattt tcaggcgttt cacacgtttt tatattgaaa tttatctatt 16920
tattgaatca atcttaaaag aaaacacaaa aaattttttt taaaaattgc ggctcaaaat 16980
taaattcaat tcgatgacgc aatagctcgt aaatctacac aaattggtta aaaactacag 17040
tagttttgtc tcttcccgcg ggttcatttg cgcgccagat atggtgattt ttctcatctc 17100
tggataaaca cgtaataaca tttctcggca caataaattt ttgctgaaac aagtgcgcgc 17160
ctttgaagag tactgcaatt tcaaacacgg ttttttggtt ggaaagcaca gtactttttc 17220
aaaggtgcac accttctcga atttctcttc gtgtcgagac caagaatgcc atttttcgat 17280
ttttaaaaaa tcaaaaaaaa aattaccttt ttaaaggtgg agtaccgaaa tttgagactt 17340
tgtttttttc ggcccaaaat ggtccaaaac taccgaattt cgtaatgaga cgttctgaaa 17400
atttctcaaa aaacaacgtt atggcggttt aaagttcagc aaaataaggc ccattttcag 17460
ctaaaatcaa aattttttcc cagcttctcg gtgtcacaac gcctggaacc taatttttat 17520
ttattcatca ctttttgata aatattgtgg tcttttatta ggcgtttatt ttattgattt 17580
aagcttattt atggtctttg tggcgttaca ttttgtaccc taaaaaccat aaatgtactt 17640
aaatcaacga ataaacgcct aatcaaaggc tacaatattt agtagaaagt gataaataaa 17700
taaaaattag gttccagacg ttgcgacacc gagaagttgg cgaaaacttt gattttagct 17760
aaaaataagc cattttccca aaactttgag cggtcataac ttttttttga gaaagaaatt 17820
ttcagaatgt ctcattacga aattcggtag ctttgggcca ttttgggccg aaaaagcaaa 17880
gtctcaaatt tcagcactcc aactttagcc tttaccttgg tgaaattttt taatctgtag 17940
tatactttat ttttggccga ctttttgaac acaaattcgg tgttagttta aaaaaacaat 18000
caaaactaac atattatcca gacgcgaaat ttttgtcggt tttcttcgcg ccaaaaagta 18060
cggtaacagg tttcggcacg atacattttt gttaaaaggt gctgctcctt tgaagagtgt 18120
ctaataattt tcaactttcg tttctgttgg aattttcttc aatttttcat agatgttttc 18180
gatgaaacaa aaaattaaca caaaatcgtc gtgtcgagac ccgaaaaaat tttgcgtctg 18240
tgcaacaaac ccggaaaatt aaagtagcat attgatccaa attgccgatt tgccggaaat 18300
tttgattttc ggcaatatac cgatttgccg gaacatttga ttttctggaa tataccgatt 18360
tgccggaatt tttggttttc ggaaatttgc cggaaattta gaattccggc aatatgccga 18420
tttgccggaa attttgattt tcggcaatat gccgatttgc cggaaatttt gattttcggc 18480
aatataccga tttgccggaa catttgattt ccggcaatat gccgatttgc cggaattttt 18540
gatttccggc aatatgccga tttgccggaa attttgattt tcggcaatat accgatttgc 18600
cggaacattt ggtttccggc aatatgccga tttgccggaa tttttggttt tcggaaattt 18660
gccggaaatt tagaattccg gcaatttgcc gatttgccgg aaattttgat ttccggcaat 18720
atgccgattt gccggaaatt ttggttttcg gaaatttgcc ggaaatttag aattccggca 18780
atatgccgat ttgccggaaa ttttgatttc cggcaatatg ccgatttgtc agaagaaatc 18840
gtttgtcacc cacacgtgta ttgatttgat ttttctagat aaaattctac gacgagctgg 18900
acgatatcat gccaatctgg cttccaccat caccaccaga ttcggatgcg gatttcgact 18960
tgagaatgga agatgattgt ctcgatctga tgtatgaaat tgaacaaatg aacgaggctc 19020
gcctaccaca agtttgtcat gaaatgagac gtccgttggc tgaaaaacag cagaaacaga 19080
acacgttgaa tgcgtttaat ggtaatattt tcaaaaaaaa atttttttga aaaaattcaa 19140
ttaaattcga ttttgagcaa tttttatcgt gaagattgca taattttgag attttgcgcc 19200
aagatttttg ttaaattgaa aaaaagagat gtgcgccttt atggagtact gtagttttga 19260
aaattgaaat tacagtactc tgtttaaagg cgcacacatg tattacgtag cgaaaagaaa 19320
agtacagtaa ttagttaaat aagactactg tagcgcttgt gtcgatttac gggctctgaa 19380
ttttatatga atttttgaaa actagaaaca tctcaaattg cataaaatta ccatttgaac 19440
ctcccgccaa gtgattttgt tcgacggggc gcgcttgcac gttttctatt ttaatttaat 19500
tcaatttttt ttgcttaatt ctcaccgatt tttcatgttt tcagtttgat tttgatggaa 19560
atttggagac aatatcaaca taaatgcttt tcaatcgaaa atgtgcattt atattgacat 19620
tttctccgaa tttccatcaa aattaaactg aaaacacgaa aaatcggtga gaattaagcg 19680
aaaaaattga gttaaatgaa aatagaaaac gtgcaagcgc gctccatcga acaaaatcaa 19740
ttggcgggag gttcaaatgg gaattgtatg caattttcaa aaggtcgtat aaaattttga 19800
agaaagcaaa ttaaatttaa aaaatcgagc tcgtaaatcg acacaggcgc taattttcaa 19860
aaaaataaaa tgacacccaa aaaatcataa gaaaatcata aataaatatt acgggaacac 19920
aaaactcaga gaacccgtat tgcacaacat atttgacgcg caaaatatga aatatctcgt 19980
agcgaaaaga aaactaccgt aatttaaaaa catttaaatg actactgtag cgcttgtgtc 20040
gatttacgag atctcgattt tctaaataaa ttttttaaaa aatgatgtca gcgatattcc 20100
atttgacttt gtttcttcgt attattttct catttttgct tgattttatt taattttata 20160
attttattta aaatcaagca aaaacgagaa aataatacga agaaacggag ttaaatggaa 20220
tatcgctgac ataatttaaa aaaaaaattt aattagaaaa tcgagatccc gtaaatcgac 20280
acaagtagtc atagtacagt agtcatttaa ctaattactg tacttttctt ttcgctgcga 20340
gatatttcat atttttattc atatttttat ttattttcat atttttatat atatatatat 20400
atatatattt cttggcgttc taatgcagtt tctctcaatt aattccagac attctatcgg 20460
caaaagaaaa ggaatcggtg tacgatgcgg tcaacaagtg ccttcaaatg ccacaatccg 20520
aagcgatcac agcagaatct gcagcgtctc cagcatacac ggaacactca tcattctcga 20580
tggatgatac aagccaggat gcgaagattg agccaagttt gactgaaaat caacaaccca 20640
ccaccaccgc cactactact actacagtac cccaacaaca acaacaacag cagcagcaaa 20700
aatcgtcgaa aaagaagaga aatgataatc gaacggtacg gaggttacta gcgaacaatt 20760
tcaagaaatt ttgaatttgt gaaaattcaa ttccggcaat ttttcgattt gccggaactt 20820
ttaattttcg ccgaattgtc aatttgccgg aaattttgat ttccgccgaa ttgtcgattt 20880
gccggaactt ttcattttcg gcaaattttc gatttgccgg aacttttaat ttttgacaaa 20940
ttgtcgatgt gccggaaatt ttgattttcg acaatttgct gatttgccgg aaatttcaat 21000
cccaacaatt ttccgatttg ccggaaattt caatcccaac aattttccga tttgccggaa 21060
atttcaatcc caacaatttt ccgatttgcc ggaaatttca atcccaacaa ttttccgatt 21120
tgccggaaat ttcaatccca gcaattttcc gatttgccgg aaatttcaat tccggcaatt 21180
tttcgatttg ccggaacttt tcattttcgg caaagtgtcg atttgccgga acttttcatt 21240
ttcgccgaat tgtcgatttg cccgaacttt taatttttga caaattgtcg ttttgctgga 21300
aattttgatt ttcgacaatt tgccaatttg ccggaacttt taatttttga caaattgtcg 21360
atttgccgga aattttgatt ttcgacaatt tgccaatttg ccggaacttt tcatttttgc 21420
caaattgtcg atttgccgga aattttaatt ccggcaattt tgcgatttgc cggaaatttc 21480
aattccggca atttaaaaac actaaaaacc aaaaattttc ggttttcccg tttttcgatg 21540
tttcagcttt tctcaaaaaa ttgcgattcc ccgaaaaatc gaaacaattt tcggggttaa 21600
aaccgggaaa ttcctaaatt cctatttaaa agaattgaaa aaaaactctc aaaattccag 21660
gctcaaaatc gaacagctga aaatggtgtg aaacgagcga caactccacc accatcatgg 21720
cgtgaagagc cagattatga tggagccgaa tggaatatag ttgaagatta tgcactactt 21780
caagcagttc aagtcgaatt tgcaaatgct catttagtcg aaaaatcggc gaatgaggga 21840
atggtgttga actgggaatt cgtgtcgaat gccgttaata agcagacaag atttttccgc 21900
tcggcccgtc aatgctcaat tcgatatcaa atgtttgttc ggccaaaaga gctcggacag 21960
ttggtggctt ctgatccgat ttccaagaaa acgatgaaag tcgacctatc gcatactgaa 22020
ttatctcatt tgagaaaagg acgaatgact acggagagcc aatatgctca tgattatgga 22080
atattgactg ataagaaaca tgtgaataga tttaaaagtg ttcgagtggc ggcaacacgg 22140
agacctgttc agttttggag aggccctaaa ggtagaggag gatggcttca taatagtcac 22200
tgcaactttt tcctcacgag ggacgagaaa aagtggtttc taggccatgg ccgaggtgcc 22260
gacaagtttc agcggccatt tatcttgctt tgttttccgc ccgttttctt tcgtttttca 22320
ccgatttttt tcgttttttc ttaataaaac tgataaataa atattttttg cagatgctaa 22380
aaaaatttcc aagtaaaaaa atcatgtatt cagtgggcat gcagcggtga aagtgggcat 22440
tgtaatatga tggattacgg gtatacaaaa cctaaacttt ttctgaaaca tgatacatgt 22500
gctgcttaaa tgctgagact acctgatttt cataacgaga ccgctgaaaa agttttgagg 22560
tttccaaaat tcaacttttt tggtgaaaaa gtcgagattt tcgcacaaaa agttgaattt 22620
tgaaaacctc aaaacttttt cagcggtctc gttatgaaaa tcaggtaatt tcagcatcta 22680
agcatcatat gtatcatgtt tcagaaaagt ttaggttttg tattcccgta atccatctat 22740
ttacattgac cactttcacc gctgcttgcc cactgaatac ataatttttt cacttggaaa 22800
ttgttttagc atctggaaaa agtatttatt tatcagtttt aataagaaaa aacgggaaaa 22860
agctgtgaaa aacaaaagaa aacaggcgga aaacaaagca agataaatgg ccgctgaaac 22920
ttgtcggccc ctcggccatg gcctagaaac cacttttctt cgtccctcgt gaggaaaaag 22980
ttgcagtgat agtctaaaat tcggaggaat tttttaaaat tggaaaaaat tgttaaattt 23040
tttttttctg gaaattggaa aatcacaaat tttcgatttt tgtttgttaa aaaaaaaaag 23100
aaaattggca taataaaaca tttctttttt ttttgaaaat tgggaacttc ttaatatcag 23160
attttttaag taagattttt ttgattttcc ggaaattcgg aaaacctgaa aattttcaac 23220
atttcaaaat aaaaatttcc gttttttttt tctgaaaatc tccaacaaaa aaaggtcaaa 23280
tcgtcagaat tattgttgga agtggcggtt tttcacgatt agagttcagt attttttctt 23340
ctgaatttca aatttgaaaa aaaatcgaat aaactgtaga aaaatgatag aaaattaaca 23400
aaaattctga ttaaaggtaa agggaaaata gaccgtaatg accgaatata actgttgaaa 23460
atatcaacaa aaaaaattct gaattttttg tgactttttc aatttttcaa gaataaaaaa 23520
aacgaccgaa taaaatattt gaattcccgc gcaaatgagt gactggttct ggccaattta 23580
cagtcttttt ataaaagaaa aaatctagaa aaaccggcga atttagccag aaaacgcaaa 23640
aaattaaaaa tgacgtcact catttgcgcg cggaatacaa atttaattag gccgtttctt 23700
tgatttttga aaaattgaaa aaaccattaa aaaatttaga aatttttttg aattttttac 23760
agttttttat tcggtcatta tggggttatt caagtagtgt cggaaaatta aaaagtgtag 23820
aaaaattacg tcacaactct gtattcaagt atataaaaac atgtatttaa atacattttg 23880
ctacattact tgaataaccc cattagggtt tattttcttt agagcaaaaa aaaacatgtt 23940
tggctctact ccacctttaa atgaaaaaat cgacaatttg tgattttgca atttccagaa 24000
aaaaaagaaa aaagttgctt tttggaaaaa accaaaaaaa gccatttgaa aaattttatt 24060
ttccaaaaaa aattattttg cagctctaga atctcgaaat ctgcaatctc taaacggcgg 24120
aatgccacca cgacacgagt cgagactcgc cgaattcgac gtaaaaacca atattcgcct 24180
ggacgccgag gacattgtca caatgtccga cgagtcgatt gtcgcctatg aagcgagcaa 24240
gaagaagcta ctggccagtc gtcaaacaaa accctcacca cgtcaagatg tccgattcca 24300
tacgctggtt cttcggccgt ataccgtacc tgtgacaact gagtactcgg ctgcaccttc 24360
tcgtcgtgaa atgcgcatcg ctgttccacc gcttcagcct tcggctttat ctacgatttc 24420
ctcagttgct gctgctgcca cgtctgggcc actaccatca attcagcatt tgcagtcgtc 24480
gtctacgggc ttgggatctc agcaaaattt gcaaaattcg cataattctg agcaaagaaa 24540
taatgtgcaa aatatgcatc aaaatcaata taattcaagt caaaatccgc caatacctat 24600
ccgacaaatc ggagcagcat catcacacca acatgatcaa ggatctcagg ggcctggggg 24660
aaaaccacaa gcctatcacc tggtgcaaca gggatcacag caacagcagc agcagcagca 24720
gcaggcgacg ttacagcgaa gaaatgcggc ggcggcggca gggtcgaatg tgcagtttat 24780
tcagcagcag cagcagcagc agcaatcggg taaaaattgt atggatttat aggaaattat 24840
atgaatttgc gcggggatag ccccggcgaa aaacgggaaa aagcgacaat ttaaaaaaaa 24900
atcgtgtgaa aatctcaatt ttttacaatt ttgaaagtaa ttttttattg aaaaaagtgg 24960
aatttaggca ttcatccaga gcagggctgg gaccaaaaaa aatttttgga ccaaaaacca 25020
aaaaacaaaa aattgaaatt tccgaaaaat caacttaagc atcaaaaatt ttttgttttt 25080
ttttttgttt tttggttttt tttggtattt tgacgaaaaa acgatttttt ggttttttgg 25140
tttttcgaga ccaaaaaaac caaaaaatcc aaaaaaatgt ttgccgtgtc tagtctcgac 25200
ctagacacgg caaacatttt ttttttttgg attttttggt ttttttggtc ccgaaaaacc 25260
aaaaaaacca aaaaatcgat ttttcgtcaa aataccaaaa aaaaacaaag aattcccagc 25320
ccctttcgcc aaaattgccg gatattttca aacctcaaaa aaaatttata aaggtggact 25380
acatcctgtg gggaaattgc tttaaaacat gcctatgggc tcacaatgac cgaatatcat 25440
gattaaaaaa ttcaacaaaa aaattactag attttatgtg attttttgaa aattaaaaaa 25500
atctcagttt tcaacctaat tcctatttga atttccgcca atttgatttg ttcgatggag 25560
cgcgcttgct ttattttttt ttattcattg attttatttt tattagcatt atttcactga 25620
ttttcttcat tttttgtgtg tttttggtgg gaattgaaat gaaaaaaaac aagataaatg 25680
cagaaagtct gttaaaaggt cattgaaaat gcttaaaacg gcaacaagct tgaaatttgt 25740
atattttaca cagttttacg cattttcaat gactttttaa caaactttcc gcatttatct 25800
tgtttttttc agttcaattt ccattaaaaa acacacaaaa aaaaatgaag aaaatcagtg 25860
aaataaggtt aataaataaa ataaatgaat aaaaatgatg caagcgcgct ccaacgaacg 25920
aattcaattg gcggaaattc aaatatggaa ttaggtgaaa actgagattt ttttttcaat 25980
tttcaaaaaa tcatataaaa tctagaacca ttttttgaat tttttaatca tgatattcgg 26040
tcattgtgac cccataggcg tgttttaaag caatttcccc acagggtgta gtccaccttt 26100
gacgaggttt gaaaatgtcc ggcaattttg ccgaaattgc cggaaacttg agatttttca 26160
gtgaaaaatt ccaaatttca tgtggaaaac tgtttttttg ttttttggaa aatgcaacaa 26220
aaaaaactat ttggcgcgaa aacgcggata gttttgccaa ttttcaagga ttttccgcta 26280
tttttaatgt ttttatgccg aattttactt taaaaaatca taattattcg gaaaatgctc 26340
gaagagcatt tccaattgtc tgtggagcgc gtttgactaa tcagataata ttccaggcgg 26400
tcaaggacaa agcttcgttg tcatgggctc gcagagctca tcaaatgatg gacaaggtgg 26460
agcatcgacc gtcggaggag gaggaggagg atcacaacag cctcaccagc agcagcagca 26520
gcagccacaa caaagaatac agtacattcc acaagttacc ggtagcggaa ataacggtgg 26580
aggtggtgga agaggaggct acggtagtac actggtcatg ccaagaggag gacgtgttgt 26640
cagggttggt agaaatacaa aatcgcgaaa aaacggcatt tccggcttcc cgaccaatca 26700
gcgatttgct ccgcccactt tcggaccaat ccgctgaccg aggcatttga ttggtttgaa 26760
attgggcgga gcagcgaatt gctgatgcga aatacgggaa gttctcattt tgatggaaat 26820
tctgcaaaat tctttaaaaa aaacaaaatc ttctcaaatt cggaaaaaat cacaaaggaa 26880
atcgaagaaa atcgcgattt ttgattcccc gaccaatcag cgatttgctc cgcccacttt 26940
tgaaccaatc agcgttcgag gcatttgatt ggttcaaaac tgggcggagc agcgagttgc 27000
tgattggatt tttcagtttt taaattttta aagctttttt taacggaaaa attcgagaaa 27060
accatagatt ttgatgagaa atgatgaaaa ttttcatgaa aaaatggaaa aatgattgga 27120
aattaatcaa aaaatcttga aaaaaaattt tttttcagag aaaatgcttc atttttggct 27180
ctgaaacgcc tcttttttat ttgtgcctcc ccgaccaatc agcaatttgc tccgcccact 27240
tttgaaccaa tcagcgaccg agcgatccga ttggtttgaa attgggcgga gctaaaatga 27300
ttttaaaaaa attcccgatt tgtttaatct agaaatttag aaaaaagaaa tatagaaaaa 27360
aaatagaaaa aaattaaaaa aaaaaaaaca aaaaatcgga aaacgtcgga aaatattacg 27420
aaaaaaattt ttttaattga ttttttttcg aaaaaaacta aaattttaac caaaaattca 27480
aagaaaaaat ttgtttttga tttttttttc gaaaaaaaaa aaaattttaa ccaaaaattc 27540
aaaaaaaaaa tgtttttctt gatttttttc caaaaaaact aaaattttga ccaaaaattc 27600
agcaaaaaaa aaatttttta attgattttt ttttcgaaaa aaaataaaat tttaaccaaa 27660
aattcaaaaa aaaaattttt tattgacttt tttcgaaaaa aactcaaatt ttaaccaaaa 27720
attcaaaaaa aaaaattttt tttttgattt tttccgaaaa aaactaaaat tttaaccaaa 27780
aattcaaaaa aaaaatgttt ttcttgattt ttttccaaaa aaactaaaat tttgaccaaa 27840
aattcagcaa aaaaaaaatt ttttaattga tttttttttc gaaaaaaaat aaaattttaa 27900
ccaaaaattc aaaaaaaaaa ttttttattg acttttttcg aaaaaaactc aaattttaac 27960
caaaaattca aaaaaaaaaa tttttttttt gattttttcc gaaaaaaact aaaattttaa 28020
ccaaaaattc aaaaaaaaat tttttattga tttttttcca aaaaaactaa aattttgacc 28080
aaaaattcag caaaaaaaaa attttttaat tgattttttt tcgaaaaaaa ctaaaatttt 28140
gaccaaaaat tcaacaaaaa aaaaattttt tattgatttt tttcgaaaaa aactaaaatt 28200
ttgaccaaaa attcaacaaa aaaaaatttt tccagccagc gggaactcta ccaggcggtg 28260
gacgtctgta tgtcgatcat aaccgtcatc catatccaat gtcgtcaaat gttgtgccag 28320
tacgtgttct accagccacg caacaaggac aacaacgaat gatgacagga caacgtcgtc 28380 
cggctccagc gcccggtact gtcgccgcaa tggtgttgcc gaatcgagga gctggtggaa 28440 
ttccgcaaat gcgcagtttg cagtgagttt tgcacggaaa ttggacgatt ttcagcgaaa 28500 
ttttcgggaa aaatggctat tttgtgtttg aaattgcgaa atttcacgat ttcgtcttaa 28560 
atacggtgcc aacctacccc atgacggttt gatctacaaa aaacgcggga atttttcaca 28620 
caaaaatatg tgagacgtct gcacgttctt aaccaatcgg ttgaaaactc tgccgcattt 28680 
ttgtagatct acggtagatc actgcagatt ttaagagaga aaaataaata aataatccca 28740 
caaggttttt aaaatttttt tttcaatcgt aaaaaatagc gaaaaattgt ttttcgcgtc 28800 
gagaccctac gcacattttt ttgcaatttt cgcttcaaaa ttacggtacc gggtctcgac 28860 
acgacatttt tattgtgtaa aatacacaat tttttggaat tttcatcgat tcgaatttaa 28920 
atatttttaa atgatttaat taattcttaa cgaaaaaaaa aaagttcgaa actgcagtac 28980 
tctttaaagg cgcacacatg tatgtattta taaaaaatgt cgtgtcaaga ccgtactttt 29040 
ggctcacaaa ttgcaaaata ttgcggaatt ttttttaatt ttagataaaa aaaaacatga 29100 
aaaatctatg gaaactaaac ttataattta aaaaaaaatt tttttaaggt ggactacgct 29160 
cagtggggaa attgctttaa aacacgccta tgaggcccca atgactgaat atcatgatta 29220 
aaacaatcaa aaaaaatttt ctagatttta tatgattttt tgaaaattgg aaaaatcaca 29280 
gttttcacct aattcttttt gaatttccgc caattggatt agttcggtgg agcgcgctta 29340 
cattattttt aattatttat tttatttatt ctcgttattt gactgatttt cttcattttt 29400 
tgtgtgtttt cctcggaaaa aggaagaaat aaacaagaca aatgcaaaat gtttgttaaa 29460
aagtaattga aaatgcgtaa aactttgata ttctgagttc cgacgacaac aagcctgaaa 29520
ttagtatatt tcacagtttt tctcattttc aattactttt taacaaacat tttgcatttg 29580 
tcttgtgtat ttcttccatt ttccgaggaa aaaacataga aaatgaagaa aatcggtcaa 29640 
ataacgagaa taaataaaat taattttaaa aaagatgcaa gtgcgctcca ccgaacaaat 29700 
ccaattggcg gaaattcaaa tatggaatta ggggaaaact gtgatttttc ccattttcaa 29760 
aaaatcatat aaaatttgga aaattttttt gaattttttt aatcatgata ttcggtcatt 29820 
ggcgccccat aggcgtgttt taaagcaatt tccccactga gcgtagtcca catttaattt 29880 
tccaaaacag cacatgctaa tcctccaagt tattccagac gaggcagtta caccggcggt 29940 
ggtggtcagc aacgaatcaa cgtgatggtt caaccacaac aaatgcgcag caacaatggc 30000 
ggtggagtcg gtggccaagg aggcctccag ggtggtccag gaggtccgca aggaattcgt 30060 
cggccactcg tcggacggcc actacaacga ggagtcgata atcaggcgcc gacggttgct 30120 
caggtcgttg ttgctccgcc gcaaggaatg cagcaggcat cacaaggacc acccgtactt 30180 
catatgcaga gagcggtttc catgcaaatg ccgacgagtc atcatcatca aggccaacag 30240 
caggctcctc cgcagagctc acagcaggct tcgcaacagg ctcccacatc ggattctggg 30300 
acgagtgctc cgccacgaca agcaccacca ccacaaaact agaattttcc cctattatcc 30360 
tattttaccc cccaaaactc tattaattaa ataatttcct tcctattttt ttcttcgtgt 30420 
gaagattatt tgtcccccaa ccaagggtgt cggtttttcg atttttcgac gtttttcaaa 30480 
aaaatttcga tttttcgaaa aattagcttc atattttggc tattactctg ctttttagaa 30540 
gaaatttgta tgttttttct tgaaaatata agcaaaatta gatttaaaaa aaatcatatt 30600 
ttatggttaa ttttctgaac atatttttca attttcgatt ttcacagaaa aacatcgaag 30660 
aatcgacaaa atcgaaaaat atgttccgaa aattaaccat aaaatatgat tttttttaaa 30720 
atctaattgt gattatattt ataagaaaaa acatacaaat ttcttctaaa aagcagagta 30780
atagccaaaa tatgaagcta atttttgaaa aaacgaaaaa ttttcgattt tccaaagaat 30840 
cgaaaaatcg aaaaatgaca cccttgcccc caactatctc tgtatattat tcatctatta 30900 
ttgattgttt ctttttgttc ctcgaaattt tttgaaatta aagttctctt ccccaccccg 30960 
atttccgttg ctttattaat cgcgattgat taattgtttt tccataaatc cccaactatt 31020 
tatctctgta tattattcat ttatattatt tatcttttat ctgtgtcgat ttacggtatc 31080 
tccgggccgt atgattttga attctcttct caaataaaat tgtttttcat ctaacatttg 31140 
atacgtgttt ttctgatttt tttgtatata tattttccat gtatatattt ctttttcttt 31200 
tttctttgct ccaactttat tttaaataat gcttttttat caagagattt tttaaaaaat 31260 
cgattttttt taaagccagg aattctgaag aatcgaaaaa aatggaacta tttttcaaat 31320 
aatgagaaag tttttttttt tcaagaaaaa aataataaaa ttctgatttt tttaataaaa 31380 
atttaataag tttttgaaga ttttcattga aaacatctaa actattcgat ttttgatttt 31440 
aaattttgaa aatagaattt tttaatatat ttttttcaaa tcgttaaaaa gagaatgccg 31500 
gaatttttta aaaattcttt aaatttagaa ataatcggaa aattttcgat tataaaacgc 31560 
tgtataaaac gaaaaaaagt ggattttgat gaaagaaaaa attttcttgt agtttttttc 31620 
agaaaaaaat tactttttat tctccatttt ttgttgttga atttttgaga aaaaactcat 31680 
tttgaaaaaa tcgaattttt tatatttttt ctaatcgtaa aaaaaatttt aaaaatgaat 31740 
tccggtaatt ttttaaaaaa taatattaat ctatagtttt gtagttaaaa aaatgtttca 31800 
cataaaaatc taaaaatttt tgattttaaa ttaaaaaaaa atcgaatttt ttaaaatttt 31860 
tttcaaatcg taaaaaaaga aacaataaac aaaagaattc cggaaaaaaa ttatattatg 31920 
attataaatt tatagttctt tactttttta aaagatttta ttttaaaaat tctaaaatga 31980 
tcgatttttg gttttttaaa ataatcaaaa atgtttgatt ttttttaaac gtgaaaaaaa 32040 
tgcaaagaaa atgaaatccg gcaaaaattg taatataatt ataaatctat acttttgtgg 32100 
ttttttccaa tatttctata aattcttgat ttttaaaata atcaaaagtt ttgattttaa 32160 
attttggaaa aattgaattt ttgtatattt ttctaatcgt aaaaaaattt ttaaaaaaat 32220 
cgaaagcgga tttttttctg ctattttgtt ttttttttga aaaccggaaa aaataccaaa 32280 
aattgatagt ttcgaccact ctggctagac taccaaaatt gaattttttt tttcgaattg 32340 
agaatggccg tggtctcatc agtagctagc cattctcttt ttatttcaat ttttaagaaa 32400 
aaagtctcta aaattttgaa aaaatcgatt ttttttactt actttgatac tttttttata 32460 
tcttttcaaa tcttaaaaaa caattttaaa aattgaattc cggaaatttt tttaaataat 32520 
ataaatctat agttttttag tttttaaaaa atatattttt ataaaaatct aaaaagttcg 32580 
gcttttgact tttgaaataa tcgaaaatgt ttgttttaaa ttttgaaaaa atataaaaaa 32640 
ttcgattttt tcaagataaa aaagcgaatt ttttgaattt ttttcaaatc gtaaaaaatg 32700 
tctgtagttt ttttaaagac tctcataaaa atctgaaatg ttcgattttt tatttttaaa 32760 
ataattttaa aaaaatttta atatttttta tcgtgcgaat tttttaccaa ctataatttg 32820 
gaataatttt caggatctca aaatatccca caatcgcgca aatatgccag gaagcaatga 32880 
agattggata aagaaggagg tcgaggacca ggacaccaac gccaacagct cgagctccag 32940 
catagccgtc tcgcgtcagc tcgaagggaa ttctgctgtt cctgacgcca tcgaccttct 33000 
gtcttctcaa atcaaaagag aagttgaaga ggaggatgat cgcaacgatg agactggacc 33060 
ccgttcggag cccgtggatg ttaagccgtc tccaaaacgc ccaacgaaga ggtcagccga 33120 
gacctggacg acggctcggc gccaagcaag aaacggtcta cggcgggaga cggttcaact 33180 
catcgattcg cgtatgtgaa tgttggagtc cgccatccat acgatccacg ccatcttgtc 33240 
atggaaactt cattgaatga aattaggtaa ggaattattg aaaataatta ttatatatcc 33300 
attttaattc aatttttttt ttcagaatcg aagatttcga aataatccag tatcttccga 33360 
tgcccttcag gacttcgatt cccatgaagc tagtgatctt cgcagtgaga agtgaagaat 33420 
ctgccgagaa gatccgctcg ttaatcgatc cttcgatgtg gatcgcggct tttggtggcg 33480 
gaaccgaaac tcaaaaattc ttgtggagcg agctgacggt ggaggatttc gtcaaggcac 33540 
acataatggc cagcaggtaa gctttcgaac atacttaatt ttttaaaaac taaaattcag 33600 
cgcaaccgat gacgtgccat atgaggcagc catggcggat cgagaatcgc tcaaacaagc 33660 
tgtaaatgat gccagctctc tgaaaggctt gaaggaggta ataatttaga aatgacagaa 33720 
aatgaaccgt gatgacgaaa tacatctgta aaaaaattat aaaaaattct aagctccgtt 33780 
tttaattttt tttttcagtt atattctgtc atagcggcct atttctctgg aaaaaaaaat 33840 
ccaaaatagc ctcaaattcg gaattatgct tcgatttttt ttctgcggta gtcctgaatt 33900 
taagacgatt ttgaattttt gtagctgcct ttcgccacaa ttacgttaaa catttcagag 33960 
catgtcgaaa gctggatgga ggatcgtgag taagatgcgg aaagatctca atggagcctg 34020 
atgatcccct tcccagcaca caagacagtt ttaattttgt gtctgtatag ttttatatta 34080 
agttttgatg ataatgaatt tttttacggt tttatccatc acttggctcg attgaagctc 34140 
ctattgtgca gcacacacgg cgtgtaaatt agtgcatcta acctaggaaa tgcgatttct 34200 
aggccatggc cgaggatccg actagatctt ttttgatggt gtttgtacag agttaaattt 34260 
cattttggag ggaaattgaa ggaaattgaa agagaaatta atttaataat attaatttga 34320 
tttaaatgac cagaacaaaa caaataaact gaatgacaag ccaatcgata ttcgtccaga 34380 
ctgggatgat gttatatgaa ctctttcacc tgaaacattt aagttttttt aataaaagag 34440 
caagcgcgct caaacgcgaa aacgctcgat ccacttaatc tggattttgt gccgattcat 34500 
ttatttcaag ctatgctcgt ttttttctgt tatgtttcat taaaaagacc gaaaacataa 34560 
caaaaagtgc ctgaaaacga aaaaaaaccg gcgacattaa ttgaaaaatt caaaactaca 34620 
atttcgccgc caaaacccaa cgagacccaa agtttcagcg cggagcgttt ccacttggcc 34680 
gtggagcgcg cttgtatata aaaggactta attttttaaa atacttaccg cagttacttc 34740 
caatgtatgt caaattcact cgattctcca ttgcagggtt actaaaatat gctccaaata 34800 
gttggcaagg cgttgacttg aataaatcgg gatggttatc ttggatgatt gcagttcgat 34860 
ttccttttgt aattatgttc taaaaagtca ttgtaatcat ttaaaagtgg agtagcgcca 34920 
gtggggattt tgtctaaatg cacttattat gatccaaaac aaccgaatat catcataaaa 34980 
cactccaaaa agtttagttt tttcataatt tcctgtcaaa gttttggcaa attggcaaaa 35040 
ttttgaaaaa tgcgagcttt tgaggtaatt taaggaaatg tcgcatgttt cgacccctac 35100 
aattatttaa tacagataat ttaaacaaaa ttaaaacata aaaatgtaga aatttttttt 35160 
gttttggtcg atttccaaaa ttatgagtgg caaaaactga gtaattgcca ctttttgaca 35220 
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gtaaataaaa aatgttcaaa attttttgaa acgttttatc atgatatttg gccattatgg 35280 

gagcaaatga gtggtttatc tattttttca ctggcgctac tccaccttta agcatgtctg 35340 

cctcaccata atcccattta atccaacgtt tcttagattt ggattcgaat atatttgaat 35400 

gactggaaaa tatgttacgt taccattcaa tgcaccaata taagtcattt gatcgagaaa 35460 

attcaaatcg gtgagatttg tgtttctgat agtcaatgtt ccgaataaaa attgtaacac 35520 

tcctaatttg gaaacatatt tttcatcttc atggtctatt aatagatctc caaggatata 35580 

catacatgta tctgatagtt tgctcattga ttcaaatgtg caataaaatg acgcatccaa 35640 

tggaccagga tctttgcaaa gtttcgcttc aatgttttca gtagaaattc caaggttcaa 35700 

tagggcaact atctcagtaa tggtgacaca aaaatcagga tgaaggtttt caaaattgaa 35760 

gtattgcctt ttattgtatg tactgtattg tatcatactg gtttgctcaa ctgtatctat 35820 

aactttctga aattttatgt cattattttc agaaatcgca ctaggcaggc aagcctgcct 35880 

taccgtcaga attggcagtc ccagtcgaat catttccgga ttatcttgta cattcaatgc 35940 

tacactagct atatccgagt tatattcgat agtttgcagg ttttgtaaaa acgacaaact 36000 

ctgtagatta gtgttccgaa ttgcaataga tcctcgaatc attgtgacat tcaaaaatga 36060 

atcataatcg aaggttgcat taatattcac taaatttaga ccagaatcta gagttttgca 36120 

tttggagtac tccttaacat ttgatacatt aactttttca ccatcacatc ctgaaatttg 36180 

actattttta tactgttaaa aaattgtttc tcaccacaat cctttaagtt ccctctgaca 36240 

atgagctcat tatacatgtg taaaaagccg ccatcacagg aaaattccag tttcggatta 36300 

ttctcgattc taatatcaca cgcctcgata ccccgatcac ggtacaagta gagatcgtag 36360 

agcacactgg ggtcgtttaa ttgtgaattg tttcggatgt aaacaccgtc tgaaatctga 36420 

agtttaagaa aaaattaagt aagttttaat ctacatgttg atccgttttt gttgaaagta 36480 

tcaaaaaatt aactggagtc agaatgtctc atttcgtttt gatcttcaaa aaatgcggga 36540 

gttcagacct agacatctcg tctgatttcg catggttaag agcgttctga cgtcacaatt 36600 

tttctgaaaa aatattcccg cattttttgt agatcaaatt aaaatgagac agcctgacac 36660 

cacgtggagt tccttatata caaaaaagtt gatttttcgc tcgtgatttt tcgttgtaac 36720 

atcatgaaaa atccagtgtt ctctgcaaac cactaaaatc cacttttttg tttcagccgc 36780 

tccgcaagca gcttcgtcga ggtcatggca gcggccgagt ttcccactcc gctgaaactc 36840 

ggcacttaat atatgaacga ctaagctagc agggccgcca ttctacctta ccagcaaaaa 36900 

tgaattcgtt cacttacaca catcacacac cacattaaag tttccttttt ctttgtcagc 36960 

tgtaaaaacc gaaaggcttg tcagactagt attctcaata ttaaatc 37007 



<210> 22 
<211> 5656 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 22 

atgccggcaa caccggtgcg tgcttcaagt actcgaataa gcagacgtac atcatcaaga 60 

tcagtggctg atgatcagcc atcaacttcg tctgcggtgg ctccacctcc ttcacccatt 120 

gccatagaaa ctgatgaaga tgcggtagtt gaggaggaga aaaagaagaa aaagacatca 180 

gatgatttgg aaattatcac tccaagaact ccagtcgatc ggcgaattcc ctacatttgc 240 

tcgattcttt tgactgaaaa tcgatcgatt cgcgataaat tggttctgag cagcggtcca 300 

gttcgtcaag aagatcacga agaacagatt gctcgagctc aacggataca gccagttgtc 360 

gatcaaattc aacgagtcga gcaaatcata ctcaatggtt cagtggaaga tattctgaaa 420 

gatcctcgat tcgcagtaat ggcagatctc acaaaagaac caccaccaac acctgcacct 480 

cctcctccaa tccagaagac aatgcaaccg attgaggtga aaattgagga ttcagagggc 540 

tcaaatacgg ctcaaccgag tgttctgccc agttgtggag gaggagagac gaatgtggaa 600 

agagccgcca aaagagaagc gcatgtattg gctcgaatcg ccgagctccg taagaacggc 660 

ttatggtcga acagtcgtct gccaaagtgc gtcgaacctg aacgtaataa aacgcattgg 720 

gattatctac tggaagaggt caaatggatg gcagttgatt tccgaaccga gacgaatacg 780 

aagcgaaaaa tcgccaaagt tatagctcac gccattgcga aacagcaccg cgacaagcag 840 

atcgagattg agagagccgc cgaacgggag atcaaggaga agcgaaaaat gtgtgcagga 900 

atcgcgaaga tggtacggga tttctggtcg tctacggata aagttgtgga tattcgagcg 960 

aaggaagttc tggagtcgag gctcaggaag gcgagaaata agcatttgat gtttgtaatt 1020 

ggacaagtcg atgaaatgag caatattgtg caagaaggac ttgtttcatc gtcgaaatcc 1080 

ccatcaattg catcggatcg agatgataaa gatgaagaat tcaaagcacc tggctctgat 1140 
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tcagaatctg acgatgagca gacaattgca aacgcggaaa agtcacagaa aaaggaagat 1200 
gttcgacagg aagttgatgc tcttcaaaac gaggcaactg tggatatgga tgactttttg 1260 
tacactttac cgccggaata tctgaaggct tatggtctga cgcaggagga tttggaggag 1320 
atgaagcgcg agaaattgga ggagcagaag gctcggaagg aagcttgtgg tgataatgag 1380 
gagaaaatgg agattgatga aagcccatca tcagatgctc aaaagccttc cacctcaagc 1440 
tcagatctca ccgccgagca gcttcaagat ccaacagctg aagacggcaa cggtgatggt 1500 
catggtgtac ttgaaaacgt ggattacgtg aagctcaaca gtcaggatag tgatgaacga 1560 
caacaagagt tggcgaatat cgcagaagaa gcgctgaaat tccagccaaa aggatataca 1620 
cttgagacga cacaagtcaa gacgcccgta ccattcctga ttcgaggaca actgagagaa 1680 
tatcaaatgg ttggattgga ttggatggtt acactttatg agaagaattt gaatggaatt 1740 
cttgccgacg agatgggcct gggaaagacg attcaaacga tttccctgct ggctcatatg 1800 
gcttgtagtg aatcgatttg gggaccacac ttgattgttg tgccgacgtc tgtcattctg 1860 
aattgggaga tggagttcaa gaaatggtgt ccggctctga agattttgac gtattttggt 1920 
acggcgaagg agcgtgccga gaagcggaag ggatggatga agccgaattg tttccatgtg 1980 
tgcatcacat catacaagac ggttactcaa gatattagag cttttaagca gagggcctgg 2040 
cagtacctaa ttctcgatga agctcaaaat atcaaaaact ggaagtccca acgttggcag 2100 
gctcttctga atgtccgtgc tcgacgtcgc cttctcctga ccggaactcc acttcagaac 2160 
tctctaatgg aactgtggtc gttgatgcat tttttgatgc caacaatatt ctcaagtcat 2220 
gatgatttca aggattggtt ctcgaatccg ttgacaggga tgatggaagg aaatatggaa 2280 
ttcaatgctc cactaatcgg acgacttcac aaagtgctcc gtccgtttat tctgcggcgg 2340 
ctcaagaagg aagttgagaa gcagctgcca gagaagactg agcatattgt gaattgttcg 2400 
ttgtcaaagc ggcagagata cctgtacgat gactttatga gtcgtagatc aacaaaggag 2460 
aatctaaagt ctggaaatat gatgtcggtg ctcaacattg tgatgcaact ccgaaaatgt 2520 
tgtaatcatc cgaatctctt cgagccgcgg ccagttgttg ctccgttcgt cgttgagaag 2580 
cttcagctcg atgttccggc tcgtctcttt gaaatttcgc agcaagatcc ctcctcctcc 2640 
tcagctagtc aaattccgga aattttcaat ttatccaaaa tcggctatca atcttccgtt 2700 
cgatctgcaa aaccactcat cgaagagctt gaagcaatga gcacttatcc ggagccacga 2760 
gcaccagaag ttggcggatt tcggttcaat cggacggctt ttgttgcaaa gaatccgcat 2820 
acggaagagt cggaggacga aggtgttatg agaagtcgtg ttctgccaaa accaattaat 2880 
ggaacagctc aaccacttca aaatggaaat tcaataccac aaaatgctcc aaatcgtcca 2940 
caaacttcat gcattcgttc aaaaaccgtc gtaaatacag ttccactgac catctccacc 3000 
gatcgaagtg gttttcattt taatatggcc aatgttggaa gaggtgttgt tcgtttggat 3060 
gattcagcac gtatgagccc accgctcaaa cgtcagaagc tcaccggaac tgcaacgaat 3120 
tggagtgatt atgttccgcg acacgttgtt gaaaagatgg aagaatcgag aaaaaaccag 3180 
ctggaaattg ttcgaaggcg atttgagatg attcgtgctc cgattattcc actggaaatg 3240 
gttgcgctgg ttcgagagga aattattgca gaatttccac gtttggctgt ggaagaggac 3300 
gaggttgtgc aggagaggct tttggagtat tgcgagttgt tggtgcaaag attcggaatg 3360 
tacgtcgaac cagtgctgac cgatgcttgg cagtgtcgtc catcatcgtc tggtcttcca 3420 
tcatatattc gcaacaattt atcaaatatc gagctgaatt ctcgttctct tctcctcaac 3480 
acctccacta atttcgatac ccgaatgtcg atctcacgtg ctcttcaatt cccagaactc 3540 
cgtctgatcg agtacgattg tggaaagctt cagacgttgg ctgttctgct tcgtcagttg 3600 
tacctgtaca agcacagatg tctgatcttc acgcaaatgt caaagatgct cgacgttctg 3660 
cagaccttcc tttctcatca cggttatcag tatttccgcc tcgacggtac cactggtgtc 3720 
gaacaaagac aggcgatgat ggagcggttc aacgcggatc ccaaggtgtt ttgcttcatt 3780 
ctgtcgacga gatccggtgg tgttggagtc aatctaaccg gtgctgacac tgtgatcttc 3840 
tacgattcgg attggaatcc gacgatggat gctcaggctc aggatagatg tcatcgtatc 3900 
ggacagacga ggaatgtctc gatttatcga ttgatttccg agcgaacaat tgaggagaat 3960 
attctgagaa aggcaacaca gaagcggcga cttggagagt tggcaattga cgaggctggc 4020 
ttcacacccg agttcttcaa acaatctgac agtattcggg atctttttga tggagagaat 4080 
gtggaagtga ctgctgtggc agatgttgcg acgacgatga gcgagaaaga aatggaggtt 4140 
gcgatggcaa agtgtgaaga tgaagctgat gtgaatgcgg cgaagattgc ggtggccgag 4200 
gcgaacgttg ataatgcgga gtttgatgag aaatcattgc cgccgatgag caatttgcaa 4260 
ggagatgagg aggctgatga gaagtatatg gagttgatac aacagctcaa accaatcgaa 4320 
cgatatgcca ttaactttct tgagacacag tacaagccag aatttgagga agaatgcaaa 4380 
gaggcagagg ctcttatcga ccaaaaacgc gaagaatggg acaaaaatct caacgatacc 4440 
gccgtcattg acctcgacga ttcggatagt ctgctgctca acgatccttc gacttctgcc 4500 
gatttttatc agagctcaag tcttttagac gagataaaat tctacgacga gctggacgat 4560 
atcatgccaa tctggcttcc accatcacca ccagattcgg atgcggattt cgacttgaga 4620 
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atggaagatg attgtctcga tctgatgtat gaaattgaac aaatgaacga ggctcgccta 4680 
ccacaagttt gtcatgaaat gagacgtccg ttggctgaaa aacagcagaa acagaacacg 4740 
ttgaatgcgt ttaatgacat tctatcggca aaagaaaagg aatcggtgta cgatgcggtc 4800 
aacaagtgcc ttcaaatgcc acaatccgaa gcgatcacag cagaatctgc agcgtctcca 4860 
gcatacacgg aacactcatc attctcgatg gatgatacaa gccaggatgc gaagattgag 4920 
ccaagtttga ctgaaaatca acaacccacc accaccgcca ctactactac tacagtaccc 4980 
caacaacaac aacaacagca gcagcaaaaa tcgtcgaaaa agaagagaaa tgataatcga 5040 
acggctcaaa atcgaacagc tgaaaatggt gtgaaacgag cgacaactcc accaccatca 5100 
tggcgtgaag agccagatta tgatggagcc gaatggaata tagttgaaga ttatgcacta 5160 
cttcaagcag ttcaagtcga atttgcaaat gctcatttag tcgaaaaatc ggcgaatgag 5220 
ggaatggtgt tgaactggga attcgtgtcg aatgccgtta ataagcagac aagatttttc 5280 
cgctcggccc gtcaatgctc aattcgatat caaatgtttg ttcggccaaa agagctcgga 5340 
cagttggtgg cttctgatcc gatttccaag aaaacgatga aagtcgacct atcgcatact 5400 
gaattatctc atttgagaaa aggacgaatg actacggaga gccaatatgc tcatgattat 5460 
ggaatattga ctgataagaa acatgtgaat agatttaaaa gtgttcgagt ggcggcaaca 5520 
cggagacctg ttcagttttg gagaggccct aaaggtagag gaggatggct tcataatagt 5580 
cactgcaact ttttcctcac gagggacgag aaaaagtggt ttctaggcca tggccgaggt 5640 
gccgacaagt ttcagc 5656 



<210> 23 
<211> 1885 
<212> PRT 

<213> Caenorhabditis elegans 
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705 710 715 720 
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835 * 840 845 

Pro Arg Pro Val Val Ala Pro Phe Val Val Glu Lys Leu Gln Leu Asp 
850 855 860 

Val Pro Ala Arg Leu Phe Glu Ile Ser Gln Gln Asp Pro Ser Ser Ser 
865 870 875 880 

Ser Ala Ser Gln Ile Pro Glu Ile Phe Asn Leu Ser Lys Ile Gly Tyr 
885 890 895 

Gln Ser Ser Val Arg Ser Ala Lys Pro Leu Ile Glu Glu Leu Glu Ala 
900 905 910 

Met Ser Thr Tyr Pro Glu Pro Arg Ala Pro Glu Val Gly Gly Phe Arg 
915 920 925 

Phe Asn Arg Thr Ala Phe Val Ala Lys Asn Pro His Thr Glu Glu Ser 
930 935 940 

Glu Asp Glu Gly Val Met Arg Ser Arg Val Leu Pro Lys Pro Ile Asn 
945 950 955 960 

Gly Thr Ala Gln Pro Leu Gln Asn Gly Asn Ser Ile Pro Gln Asn Ala 
965 970 975 

Pro Asn Arg Pro Gln Thr Ser Cys Ile Arg Ser Lys Thr Val Val Asn 
980 985 990 

Thr Val Pro Leu Thr Ile Ser Thr Asp Arg Ser Gly Phe His Phe Asn 
995 1000 1005 

Met Ala Asn Val Gly Arg Gly Val Val Arg Leu Asp Asp Ser Ala Arg 
1010 1015 1020 

Met Ser Pro Pro Leu Lys Arg Gln Lys Leu Thr Gly Thr Ala Thr Asn 
1025 1030 1035 1040 

Trp Ser Asp Tyr Val Pro Arg His Val Val Glu Lys Met Glu Glu Ser 
1045 1050 1055 

Arg Lys Asn Gln Leu Glu Ile Val Arg Arg Arg Phe Glu Met Ile Arg 
1060 1065 1070 

Ala Pro Ile Ile Pro Leu Glu Met Val Ala Leu Val Arg Glu Glu Ile 
1075 1080 1085 

Ile Ala Glu Phe Pro Arg Leu Ala Val Glu Glu Asp Glu Val Val Gln 
1090 1095 1100 

Glu Arg Leu Leu Glu Tyr Cys Glu Leu Leu Val Gln Arg Phe Gly Met 
1105 1110 1115 1120 

Tyr Val Glu Pro Val Leu Thr Asp Ala Trp Gln Cys Arg Pro Ser Ser 
1125 1130 1135 

Ser Gly Leu Pro Ser Tyr Ile Arg Asn Asn Leu Ser Asn Ile Glu Leu 
1140 1145 1150 

Asn Ser Arg Ser Leu Leu Leu Asn Thr Ser Thr Asn Phe Asp Thr Arg 
1155 1160 1165 

Met Ser Ile Ser Arg Ala Leu Gln Phe Pro Glu Leu Arg Leu Ile Glu 






1170 1175 1180 

Tyr Asp Cys Gly Lys Leu Gln Thr Leu Ala Val Leu Leu Arg Gln Leu 
1185 1190 1195 1200 

Tyr Leu Tyr Lys His Arg Cys Leu Ile Phe Thr Gln Met Ser Lys Met 
1205 1210 1215 

Leu Asp Val Leu Gln Thr Phe Leu Ser His His Gly Tyr Gln Tyr Phe 
1220 1225 1230 

Arg Leu Asp Gly Thr Thr Gly Val Glu Gln Arg Gln Ala Met Met Glu 
1235 1240 1245 

Arg Phe Asn Ala Asp Pro Lys Val Phe Cys Phe Ile Leu Ser Thr Arg 
1250 1255 1260 

Ser Gly Gly Val Gly Val Asn Leu Thr Gly Ala Asp Thr Val Ile Phe 
1265 1270 1275 1280 

Tyr Asp Ser Asp Trp Asn Pro Thr Met Asp Ala Gln Ala Gln Asp Arg 
1285 1290 1295 

Cys His Arg Ile Gly Gln Thr Arg Asn Val Ser Ile Tyr Arg Leu Ile 
1300 1305 1310 

Ser Glu Arg Thr Ile Glu Glu Asn Ile Leu Arg Lys Ala Thr Gln Lys 
1315 1320 1325 

Arg Arg Leu Gly Glu Leu Ala Ile Asp Glu Ala Gly Phe Thr Pro Glu 
1330 1335 1340 

Phe Phe Lys Gln Ser Asp Ser Ile Arg Asp Leu Phe Asp Gly Glu Asn 
1345 1350 1355 1360 

Val Glu Val Thr Ala Val Ala Asp Val Ala Thr Thr Met Ser Glu Lys 
1365 1370 1375 

Glu Met Glu Val Ala Met Ala Lys Cys Glu Asp Glu Ala Asp Val Asn 
1380 1385 1390 

Ala Ala Lys Ile Ala Val Ala Glu Ala Asn Val Asp Asn Ala Glu Phe 
1395 1400 1405 

Asp Glu Lys Ser Leu Pro Pro Met Ser Asn Leu Gln Gly Asp Glu Glu 
1410 1415 1420 

Ala Asp Glu Lys Tyr Met Glu Leu Ile Gln Gln Leu Lys Pro Ile Glu 
1425 1430 1435 1440 

Arg Tyr Ala Ile Asn Phe Leu Glu Thr Gln Tyr Lys Pro Glu Phe Glu 
1445 1450 1455 

Glu Glu Cys Lys Glu Ala Glu Ala Leu Ile Asp Gln Lys Arg Glu Glu 
1460 1465 1470 

Trp Asp Lys Asn Leu Asn Asp Thr Ala Val Ile Asp Leu Asp Asp Ser 
1475 1480 1485 

Asp Ser Leu Leu Leu Asn Asp Pro Ser Thr Ser Ala Asp Phe Tyr Gln 
1490 1495 1500 

Ser Ser Ser Leu Leu Asp Glu Ile Lys Phe Tyr Asp Glu Leu Asp Asp 
1505 1510 1515 1520 

Ile Met Pro Ile Trp Leu Pro Pro Ser Pro Pro Asp Ser Asp Ala Asp 
1525 1530 1535 

Phe Asp Leu Arg Met Glu Asp Asp Cys Leu Asp Leu Met Tyr Glu Ile 
1540 1545 1550 

Glu Gln Met Asn Glu Ala Arg Leu Pro Gln Val Cys His Glu Met Arg 
1555 1560 1565 

Arg Pro Leu Ala Glu Lys Gln Gln Lys Gln Asn Thr Leu Asn Ala Phe 
1570 1575 1580 

Asn Asp Ile Leu Ser Ala Lys Glu Lys Glu Ser Val Tyr Asp Ala Val 
1585 1590 1595 1600 

Asn Lys Cys Leu Gln Met Pro Gln Ser Glu Ala Ile Thr Ala Glu Ser 
1605 1610 1615 

Ala Ala Ser Pro Ala Tyr Thr Glu His Ser Ser Phe Ser Met Asp Asp 
1620 1625 1630 

Thr Ser Gln Asp Ala Lys Ile Glu Pro Ser Leu Thr Glu Asn Gln Gln 
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1635 1640 1645 

Pro Thr Thr Thr Ala Thr Thr Thr Thr Thr Val Pro Gln Gln Gln Gln 
1650 1655 1660 

Gln Gln Gln Gln Gln Lys Ser Ser Lys Lys Lys Arg Asn Asp Asn Arg 
1665 1670 1675 1680 

Thr Ala Gln Asn Arg Thr Ala Glu Asn Gly Val Lys Arg Ala Thr Thr 
1685 1690 1695 

Pro Pro Pro Ser Trp Arg Glu Glu Pro Asp Tyr Asp Gly Ala Glu Trp 
1700 1705 1710 

Asn Ile Val Glu Asp Tyr Ala Leu Leu Gln Ala Val Gln Val Glu Phe 
1715 1720 1725 

Ala Asn Ala His Leu Val Glu Lys Ser Ala Asn Glu Gly Met Val Leu 
1730 1735 1740 

Asn Trp Glu Phe Val Ser Asn Ala Val Asn Lys Gln Thr Arg Phe Phe 
1745 1750 1755 1760 

Arg Ser Ala Arg Gln Cys Ser Ile Arg Tyr Gln Met Phe Val Arg Pro 
1765 1770 1775 

Lys Glu Leu Gly Gln Leu Val Ala Ser Asp Pro Ile Ser Lys Lys Thr 
1780 1785 1790 

Met Lys Val Asp Leu Ser His Thr Glu Leu Ser His Leu Arg Lys Gly 
1795 1800 1805 

Arg Met Thr Thr Glu Ser Gln Tyr Ala His Asp Tyr Gly Ile Leu Thr 
1810 1815 1820 

Asp Lys Lys His Val Asn Arg Phe Lys Ser Val Arg Val Ala Ala Thr 
1825 1830 1835 1840 

Arg Arg Pro Val Gln Phe Trp Arg Gly Pro Lys Gly Arg Gly Gly Trp 
1845 1850 1855 

Leu His Asn Ser His Cys Asn Phe Phe Leu Thr Arg Asp Glu Lys Lys 
1860 1865 1870 

Trp Phe Leu Gly His Gly Arg Gly Ala Asp Lys Phe Gln 
1875 1880 1885 
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<212> DNA 
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<222> (1001) . . . (1035) 
<221> CDS 

<222> (1920) . . . (2062) 
<221> CDS 

<222> (2114) . . . (2190) 
<221> CDS 

<222> (2241) . . . (2501) 



<221> CDS 
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<221> CDS 
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<222> (3497) . . . (3631) 



<221> CDS 
<222> (4227) . . . (4690) 

<221> CDS 
<222> (5293) . . . (6058) 

<221> CDS 
<222> (6696) . . . (7058) 

<221> CDS 
<222> (7609) . . . (8338) 

<221> CDS 
<222> (8771) . . . (8933) 

<221> CDS 
<222> (9511) . . . (10306) 

<221> CDS 
<222> (10774) . . . (10851) 


<400> 24 

gtcaatggaa ttctcgacgc ggatcttgtt agagatgccg tcgagagaga tttgatcaaa 60 

ttgcggtacg ctgaaacgga tgcaccagtt ttacaggtaa aatggaaata tacaaactca 120 

aaagtaaaat tttatgaatt tcagatcaac aactcactat acacggcatc ctgggagcaa 180 

gatctcggaa caaatatggt tctgcagtca aaaggaaaag agatggaagt gatttcgtgt 240 

acatcgacca tgatgactgc agaaaaagcc ctgttgacct cgttaagcac cgaaggatct 300 

acactagccg ccaatgcaga gactgctccg aaatctgatc tcagtcgaac tcaaccacgt 360 

caacaatgat tttcaaaata taaattaaca tgaagctctg aaataaactc atataactgc 420 

taaaataaaa ctgttgcttt tgaaaccaac atttgttaga caacctgcgt ctcacagtca 480 

tttttcaata tattggcgcc gcgcacacac aaagaagaag aattcgtcct catggcatgg 540 

catgtgcagt cagcggccac cctgtgtaac cactgcgtat cgcatctttc cacgtgtttt 600 

tgcaatcttg ctgtcacgtt catttcctcg tacaaccatc tcttctaccc ccgttgcctc 660 

ctccaccatc tcatctcaat tgtgtcgttg ccctccctct ccccaagtct ttctgcgtct 720 

cttagtgctc ttcgagaaaa gaacgaggag agctgtgaga cgctagtagg aaacgcattc 780 

tcaattcgat ataggcacat tgagagagag cgagcgccgt ttcgacgtct tctagccttc 840 

acatcatcca gacgacgttc acacgcacac acagccaacc ccacccttct gacaacgaat 900 

agacgacgaa gaagagaaga agaaaaagaa gaaggtaccc atttttcatt ccctttttgc 960 
ctccacactt cactattatc gattttgtga gcgagctcta atg ttt caa cgc aaa 1015 

Met Phe Gln Arg Lys 
1 5 

gtg gta ttg cct aaa aag cg gtgagaattt gcttcagaca gaaattcgtt 1065 
Val Val Leu Pro Lys Lys Arg 
10 

ttttttaaca agaaaaatcc ggtttcaatt gtcgtagaag gtcaattttt actttcaacg 1125 
ctcttcattg acggaaaact cgtttttctt tcaaatttta aattacagag gcattttact 1185 
caaggtttgt tttaatttaa attaaaaata aattttaaaa tagaaatatg gataatataa 1245 
aatgttttct tcaaaaaatg cactcaggtt caccaaaaaa tcgataatta aaaatacggt 1305 
cgcaaaggag cgtcgttagc tgctaatcaa tggtcttaaa acgaaatcta tcgatttttg 1365 
tgtactacac acggacaagt gctccaccgt tattttttga acgagtgcgt tgcaattcca 1425 
tcccattttg acgtttttct tttttttttc atcaaatttt ttagcattta aagtaaagtc 1485 
aatgataacc tgcaaataat aatgtaaaat tcattaaaaa ccgagagaaa aagtctaaag 1545 
tcataaattt ttgataaaaa agtgattttc gaaactaaaa atcattcaaa ttaaagttga 1605 
acctgattct tcaattttta ttatatatta aaagcttgat ccactcaaat aaaaggagtt 1665 
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tttaattgag aaaaaaagca aatgaaaaaa tcgataatta aattgggcgc caacctagat 1725 

tttaatatgt ttttgttaga aatttgtata ttttcatcac tctctgactt taagcattcg 1785 

tattttaagg aagtgtgagc tttctaatat gttttttatt aaaaaaaaca tgtttttaac 1845 

aatctccctg tcatccccat cacctaatgc actcaaataa tcaataatca caatactttt 1905 

attttttctt gcag a aca gaa atg gtc caa acg aga cga aag aca gct gca 1956 
Thr Glu Met Val Gln Thr Arg Arg Lys Thr Ala Ala 
15 20 

gct gta cag gac ggt ggt gcc gtt aag gag aac aaa gcc aag cca cct 2004 
Ala Val Gln Asp Gly Gly Ala Val Lys Glu Asn Lys Ala Lys Pro Pro 
25 30 35 40 

gcc cct caa acg cct aca aaa cga gca aaa cga ggt cgt ccc ccg aaa 2052 
Ala Pro Gln Thr Pro Thr Lys Arg Ala Lys Arg Gly Arg Pro Pro Lys 
45 50 55 

att aag act g gtgagcgaat gactatacgg aagattgaaa attcacgtgg 2102 
Ile Lys Thr 

aatacttgca g at gcc aat act ttg aat acg cca agc act tct tcc aac 2151 
Asp Ala Asn Thr Leu Asn Thr Pro Ser Thr Ser Ser Asn 
60 65 70 

ttg gtc gat gac aaa ctt ctc att gag tct gaa tca cag gtaaattgat 2200 
Leu Val Asp Asp Lys Leu Leu Ile Glu Ser Glu Ser Gln 
75 80 85 

tcttttctat tcaaaaatta atctaaacta tacattccag gac tcg att ctc aca 2255 
Asp Ser Ile Leu Thr 
90 

aac gaa gcc gac tct ttt ctg gaa aaa gaa gtg gaa gaa atc gaa gat 2303 
Asn Glu Ala Asp Ser Phe Leu Glu Lys Glu Val Glu Glu Ile Glu Asp 
95 100 105 

agt tca gat ata ctt ccc gat aaa att aat tct cca gaa aaa cca agt 2351 
Ser Ser Asp Ile Leu Pro Asp Lys Ile Asn Ser Pro Glu Lys Pro Ser 
110 115 120 

gtt ttg gtg aag cgg aga tcg agt acg cgg tta aaa gtg aag act gat 2399 
Val Leu Val Lys Arg Arg Ser Ser Thr Arg Leu Lys Val Lys Thr Asp 
125 130 135 

gaa gat gaa aaa gat gtt cct gtg aac ata gaa gta gcc gtt tta gaa 2447 
Glu Asp Glu Lys Asp Val Pro Val Asn Ile Glu Val Ala Val Leu Glu 
140 145 150 

gaa aaa tca att caa atc gag cca aca tct ccc gct cac ccg gaa gat 2495 
Glu Lys Ser Ile Gln Ile Glu Pro Thr Ser Pro Ala His Pro Glu Asp 
155 160 165 170 

cct cag gtgagctttt tttaaaaata tgtattaatc aaaattcctt catttccag cct 2553 
Pro Gln Pro 

tcg act tct tct ctt cca ctg gta gaa cca att gaa gac att gtg gag 2601 
Ser Thr Ser Ser Leu Pro Leu Val Glu Pro Ile Glu Asp Ile Val Glu 
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175 180 185 

cca aat gag cca aca agc tct gcc gat cct cca gta tca aat att aag 2649 
Pro Asn Glu Pro Thr Ser Ser Ala Asp Pro Pro Val Ser Asn Ile Lys 
190 195 200 205 

gat gag gat att aaa gaa gaa gag cca ctg att aaa aag cca gct tcc 2697 
Asp Glu Asp Ile Lys Glu Glu Glu Pro Leu Ile Lys Lys Pro Ala Ser 
210 215 220 

gat gag tca gaa tct atg gat ata gct aac tct gaa agt gga aat gat 2745 
Asp Glu Ser Glu Ser Met Asp Ile Ala Asn Ser Glu Ser Gly Asn Asp 
225 230 235 

tcc gat tca agt gaa gct gat cct agg acg ata cca tct ttc tct ata 2793 
Ser Asp Ser Ser Glu Ala Asp Pro Arg Thr Ile Pro Ser Phe Ser Ile 
240 245 250 

cct ctt ccc gac aca cca cct cca aat ttt gcg aaa aga gga gaa ata 2841 
Pro Leu Pro Asp Thr Pro Pro Pro Asn Phe Ala Lys Arg Gly Glu Ile 
255 260 265 

cat gta gat gta gat cag aaa aat tcc aag caa tca gga gaa tca caa 2889 
His Val Asp Val Asp Gln Lys Asn Ser Lys Gln Ser Gly Glu Ser Gln 
270 275 280 285 

tcg cct tgg gag cg gtaagaatat ttatcctagc caggtgttat aacaaaattg 2943 
Ser Pro Trp Glu Arg 
290 

aatagtttca g a gca aga gaa aag tct gca tcg aac cca ttg tcc tct 2991 
Ala Arg Glu Lys Ser Ala Ser Asn Pro Leu Ser Ser 
295 300 



cca aca atg agc cga ccc agg ata cac ttc ctt cat cca gca tat caa 3039 
Pro Thr Met Ser Arg Pro Arg Ile His Phe Leu His Pro Ala Tyr Gln 
305 310 315 

agt ttc aca aat gat tca gtt tca cct cta cca cca ccg cca cca gag 3087 
Ser Phe Thr Asn Asp Ser Val Ser Pro Leu Pro Pro Pro Pro Pro Glu 
320 325 330 

ccg gct cca gct cgt gaa aaa gtg gaa aat ggt ggt cca act act ttc 3135 
Pro Ala Pro Ala Arg Glu Lys Val Glu Asn Gly Gly Pro Thr Thr Phe 
335 340 345 350 

aaa atg act ttc aaa aaa gct gca aat att cct atc ttg aag aca tcg 3183 
Lys Met Thr Phe Lys Lys Ala Ala Asn Ile Pro Ile Leu Lys Thr Ser 
355 360 365 

gca ttt gaa caa cca tca tca cct cca cct tcc tca tca gtt tct tca 3231 
Ala Phe Glu Gln Pro Ser Ser Pro Pro Pro Ser Ser Ser Val Ser Ser 
370 375 380 

tca att tca tta tct gaa gtg aat tct tct aca tcg ata gcc tcc gag 3279 
Ser Ile Ser Leu Ser Glu Val Asn Ser Ser Thr Ser Ile Ala Ser Glu 
385 390 395 
tct tct cca gcg aaa aga agc tca aat ttc gat tta act gcc tca aat 3327 
Ser Ser Pro Ala Lys Arg Ser Ser Asn Phe Asp Leu Thr Ala Ser Asn 
400 405 410 

gag ctt cca cca cct cag atg gtt gaa ctt ccc aag ctc tca ttt ttc 3375 
Glu Leu Pro Pro Pro Gln Met Val Glu Leu Pro Lys Leu Ser Phe Phe 
415 420 425 430 

aat atg cct cca gcc gtt cgc tcc gca gag gttagttaac tttttcccgg 3425 
Asn Met Pro Pro Ala Val Arg Ser Ala Glu 
435 440 

tttcatgaaa tttcagcggt atctgtcctc cttttggtgt gtgccctcac aacctaacct 3485 
cttttatcca g gac gat tct gcg atg acg tcg gaa gaa ccg atc ctt ctc 3535 
Asp Asp Ser Ala Met Thr Ser Glu Glu Pro Ile Leu Leu 
445 450 

ctc cgt tct ccg aat tcc gcc act cct gat gat gat gca ctt ttc ctc 3583 
Leu Arg Ser Pro Asn Ser Ala Thr Pro Asp Asp Asp Ala Leu Phe Leu 
455 460 465 

acg acc cca cca cca ccc aag atg acc gaa tca gaa att caa gca ctg 3631 
Thr Thr Pro Pro Pro Pro Lys Met Thr Glu Ser Glu Ile Gln Ala Leu 
470 475 480 485 

gtgagccaga tcacacattt cgatgtcgtg tgtggaaccc aggaatttca gaccgttttt 3691 
ctttacacct catccccttt tgtgttatgt taacattcat tttgtgtctc aaacactgca 3751 
tgcttttgca cttggaaatt aaaaaataat gcgttctggg attttgtgtg ttaaggtgga 3811 
gtagagtttg tgaggctaga aagtatgcct ttttcgtttc tccactgcaa aatttcgttt 3871 
gaaaaaaaca aaaaatttac taaaatttga aatttcacca acttgccgtt gtcacagctg 3931 
ctgaaataca gtttttattg cattttcacc ctttattgca tattattatt agacaccttt 3991 
taggtcaata ggcaaccgaa aatatccgaa tttgacttaa aatgtaccta aattaaggaa 4051 
ctaacttgag atatacgact aaaaatgcaa taaattgtga gaattattgt tatgaaattc 4111 
agccgtttta ggctagtttt agccaaaaac cgacaaactc tattccaatt aattttccac 4171 
tcctgcacct cgattagtga ttttttgaag aaaaaaaatt atcttcttat ttcag aaa 4229 
Lys 

gta gcg acg gaa aaa gtg aat caa gta att gct cga cgt gaa gat tct 4277 
Val Ala Thr Glu Lys Val Asn Gln Val Ile Ala Arg Arg Glu Asp Ser 
490 495 500 

gaa aaa gat gta cgt cac aga gaa gat cga gat gat tat gat aga cga 4325 
Glu Lys Asp Val Arg His Arg Glu Asp Arg Asp Asp Tyr Asp Arg Arg 
505 510 515 

cgt gac gac cgt gac aga aga tcc aga aag act gat tcg gaa cga aat 4373 
Arg Asp Asp Arg Asp Arg Arg Ser Arg Lys Thr Asp Ser Glu Arg Asn 
520 525 530 

gat caa aga gga cga caa cgt gaa gat gat gaa cga aga gct cga gaa 4421 
Asp Gln Arg Gly Arg Gln Arg Glu Asp Asp Glu Arg Arg Ala Arg Glu 
535 540 545 550 

cga gaa aga gaa gtt acg aaa cga cat gat cgg gaa agg gaa gag atg 4469 
Arg Glu Arg Glu Val Thr Lys Arg His Asp Arg Glu Arg Glu Glu Met 
555 560 565 
cga tta cag aaa caa aaa gat gag gaa aga aga aag aaa gat gaa gag 4517 
Arg Leu Gln Lys Gln Lys Asp Glu Glu Arg Arg Lys Lys Asp Glu Glu 
570 575 580 

gaa agg ata caa aaa gag aat gat gag aaa aaa caa aaa gag gat gaa 4565 
Glu Arg Ile Gln Lys Glu Asn Asp Glu Lys Lys Gln Lys Glu Asp Glu 
585 590 595 

gcc aaa atg gag gag gag aaa aag aag att aaa gag gag gaa atg aag 4613 
Ala Lys Met Glu Glu Glu Lys Lys Lys Ile Lys Glu Glu Glu Met Lys 
600 605 610 

att cct gaa ttt gag ttg att agc gaa tca aaa tat ttg acg agg aat 4661 
Ile Pro Glu Phe Glu Leu Ile Ser Glu Ser Lys Tyr Leu Thr Arg Asn 
615 620 625 630 

gcg aat aaa aag aag act gaa tcc tta ac gtaagttatt atttataaat 4710 
Ala Asn Lys Lys Lys Thr Glu Ser Leu Thr 
635 640 



ttgacttaaa aattgataac tttcaaaatt aagtgattca atagactcaa aagaatgaaa 4770 
aactagagtg cgcctttaaa gagtactgta atttcaaact tttgttgctg ctcatttttc 4830 
atcgattttt cttagttttt cgttaaaaat aattcaacca ttggattaaa aaaaattaaa 4890 
aacacataaa ttttattttg aaaagtaatg agaaaaacta tagaaattcg ccgaaaattc 4950 
tacagcaaca aaagctcaaa attacagtac tttttaaagg agcacatctt tctgaattta 5010 
acaaaaattc ggagattttt ctttttttcg tgtttttctg gcgaaaaaac gatttttcgc 5070 
ttttaccgga aacggtatcc ggaggaaaaa aaaaacgaaa aaagcgaaaa attttaagaa 5130 
gtttcaagat tagttacaaa ctcttttcaa aagcagattc tacagttttt tggggttttg 5190 
ccaaaaaatt tatgaaatat aatgtttttt agactagaaa aataaactaa ttttaatttt 5250 
caatcaaaag ctcattatta tatttatatt tatataattc ag t tgc gaa tgc cat 5305 
Cys Glu Cys His 



cga act ggt gga aac tgt teg gac aat act tgt gtg aat cgt gca atg 
Arg Thr Gly Gly Asn Cys Ser Asp Asn Thr Cys Val Asn Arg Ala Met 
645 ~ 650 655 660 



5353 



etc ace gag tgc cca tea tea tgt cag gtc aaa tgc aag aat caa cga 
Leu Thr Glu Cys Pro Ser Ser Cys Gin Val Lys Cys Lys Asn Gin Arg 
665 670 675 



5401 



ttt gca aag aaa aag tac gcg get gtt gaa gca ttc cac act gga ace 
Phe Ala Lys Lys Lys Tyr Ala Ala Val Glu Ala Phe His Thr Gly Thr 
680 " 685 690 



5449 



gcc aaa gga tgt gga ctt cga gca gtg aaa gac ata aaa aaa gga aga 5497
Ala Lys Gly Cys Gly Leu Arg Ala Val Lys Asp Ile Lys Lys Gly Arg
695 700 705



ttc atc att gaa tat ata gga gaa gtt gtg gaa aga gat gat tat gag 5545
Phe Ile Ile Glu Tyr Ile Gly Glu Val Val Glu Arg Asp Asp Tyr Glu
710 715 720



aag aga aaa acg aaa tat gca gct gat aaa aag cac aaa cat cat tat 5593
Lys Arg Lys Thr Lys Tyr Ala Ala Asp Lys Lys His Lys His His Tyr
725 730 735 740

ctc tgt gat act gga gtc tac acg atc gac gca aca gtc tac gga aat 5641
Leu Cys Asp Thr Gly Val Tyr Thr Ile Asp Ala Thr Val Tyr Gly Asn
745 750 755



cca tct cga ttt gtg aat cat agt tgt gat cct aat gct ata tgt gag 5689
Pro Ser Arg Phe Val Asn His Ser Cys Asp Pro Asn Ala Ile Cys Glu
760 765 770



aaa tgg tct gta cca aga act cct gga gac gtt aat cga gtt ggt ttc 5737
Lys Trp Ser Val Pro Arg Thr Pro Gly Asp Val Asn Arg Val Gly Phe
775 780 785



ttc tcg aaa cga ttc att aaa gcc ggc gaa gaa atc aca ttt gat tat 5785
Phe Ser Lys Arg Phe Ile Lys Ala Gly Glu Glu Ile Thr Phe Asp Tyr
790 795 800



caa ttt gtc aac tac gga cgt gac gct caa caa tgt ttc tgt gga agt 5833
Gln Phe Val Asn Tyr Gly Arg Asp Ala Gln Gln Cys Phe Cys Gly Ser
805 810 815 820



gct tca tgt agt gga tgg att ggg cag aaa ccg gaa gaa ttt tca tct 5881
Ala Ser Cys Ser Gly Trp Ile Gly Gln Lys Pro Glu Glu Phe Ser Ser
825 830 835



gat gag gat gat gat att gtg act aca agg cat att aat atg gat gaa 5929
Asp Glu Asp Asp Asp Ile Val Thr Thr Arg His Ile Asn Met Asp Glu
840 845 850



gaa gaa gaa gaa aag ttg gaa ggt ctt gat cat ctt gga aat cat gaa 5977
Glu Glu Glu Glu Lys Leu Glu Gly Leu Asp His Leu Gly Asn His Glu
855 860 865



cgg aat gaa gtg atc aag gat atg ttg gat gat ttg gtc att cgg aat 6025
Arg Asn Glu Val Ile Lys Asp Met Leu Asp Asp Leu Val Ile Arg Asn
870 875 880

aag aag cat gct agg aag gtt atc aca att gcg gtaagcattt atttgtagag 6078
Lys Lys His Ala Arg Lys Val Ile Thr Ile Ala
885 890 895

aaaatttaaa aattaaagat ggagtaccga aatccgagaa atatatttaa ttgactccaa 6138

tttttcctct gattccgaat ttttaaatga aaaaattcaa aaaaatttcc ttgattttat 6198

gttttaactt gaaattgcga atttcatttg tacagatttt tgaaacgccg aattttcgcg 6258

ccagagaagc catgtgtcga tttttgagat ttgtgtatat ttacaagatt ttgaatcttc 6318

atcggatgct gatttgcgtt tttcatcatt atattatcaa aaaactaaca atttgttcgg 6378

ttttacggaa attaacaata tagactagac atttcgtaaa tatacacaaa tctcgtaaat 6438

cgacacatgg cgtctctggc gcgaaaattc ggcatttgaa aaatcttatg cgggcactaa 6498

tgaaattcgt gatttcaagc tgaaatataa aatcagggaa ttttccttgc attttttcac 6558

tcagaacttc ggaatcagtt gcaaatttgg agtcatttga aaatatttct cagatttcgg 6618

tactccacct ttattataat ttttaaaatt ttttaaatga ttttttttcc atgttcaaca 6678

aaaaaataaa ttttcag tct gca atg acc gat tac tct caa cgt gtg gat 6728
Ser Ala Met Thr Asp Tyr Ser Gln Arg Val Asp
900 905

gtc att caa gaa atc ttc tcc tca gac acc tcc gta acc gtt caa aaa 6776
Val Ile Gln Glu Ile Phe Ser Ser Asp Thr Ser Val Thr Val Gln Lys
910 915 920



ttc tat gca aaa gag gga atg gct aca ttg atg gct gaa tgg ttg tct 6824






Phe Tyr Ala Lys Glu Gly Met Ala Thr Leu Met Ala Glu Trp Leu Ser
925 930 935

gaa gat gat tat tcg ctg gat aat ctg aaa ctt gtt caa gct att ctc 6872
Glu Asp Asp Tyr Ser Leu Asp Asn Leu Lys Leu Val Gln Ala Ile Leu
940 945 950

aaa gct ctt cac act gaa cta ttc gat tcg tgc gcc aaa aat gat cga 6920
Lys Ala Leu His Thr Glu Leu Phe Asp Ser Cys Ala Lys Asn Asp Arg
955 960 965 970

ctc tta cga gat tct aca tca cga tgg gtc aat gcg aaa atg gat gaa 6968
Leu Leu Arg Asp Ser Thr Ser Arg Trp Val Asn Ala Lys Met Asp Glu
975 980 985

tat gtt gat ata caa gtg ata gct gat tca ctt att gct tgt gtt gaa 7016
Tyr Val Asp Ile Gln Val Ile Ala Asp Ser Leu Ile Ala Cys Val Glu
990 995 1000



gat ccc gta cag gag tac aag gat gtt tgc aaa gtt ata gag 7058
Asp Pro Val Gln Glu Tyr Lys Asp Val Cys Lys Val Ile Glu
1005 1010 1015



gtatatacat attaattttt aaaaaagaat attttttgca tgtcacaaaa tatttggaaa 7118
ttttcccgaa aaacccatga aatcaaaaaa caaattaaat agtaaaatta tttcctccta 7178 
cgaacatttt tcgatttttc gttttccgat attcctttta aaaatctgat ttaaaaaaaa 7238 
aaaacttaaa ttttaggtct ttttgctcct ttttagaagc aatttatatg ttttttaaaa 7298
caaaacttaa aattagcatt tttatgggta attttctgaa cacatttttt tttcgaaaaa 7358 
aatggccaga atttcaacca cttctccgta aaatcgaaat taactaattt tttctctata 7418 
catttttcaa aaaaagactc ctcatttatt gtattagata caaatatatg ttttcctcat 7478 
caaaatttac gaaatttgtt ataattttga attttttttg tttttttttc gaaaaattga 7538 
aaattttcta attttgaaac gatattatac aatttcagcg ccatcaattt aactaattaa 7598 
ataatttcag aaa ggt ctc gtc gaa aac ttc aca aga gcc aaa gag atg 7647
Lys Gly Leu Val Glu Asn Phe Thr Arg Ala Lys Glu Met
1020 1025

gcc tat cgg tta aat caa tac tgg ttc aat cga tca gtg agc ttc aaa 7695
Ala Tyr Arg Leu Asn Gln Tyr Trp Phe Asn Arg Ser Val Ser Phe Lys
1030 1035 1040 1045



att cca aaa aag ata cgt gat cct gtg cca aaa gat gtt cca gtc aga 7743
Ile Pro Lys Lys Ile Arg Asp Pro Val Pro Lys Asp Val Pro Val Arg
1050 1055 1060



caa gaa gat gct aca aca tca tca caa tct cat gat aat agt agt aga 7791
Gln Glu Asp Ala Thr Thr Ser Ser Gln Ser His Asp Asn Ser Ser Arg
1065 1070 1075



act gta tca ccg aat cat cga cat cat tca tct tca tat tca aat tca 7839
Thr Val Ser Pro Asn His Arg His His Ser Ser Ser Tyr Ser Asn Ser
1080 1085 1090

tgt tat caa gaa cga gaa cca tct cat ata cga ttc ttt aat aat gga 7887
Cys Tyr Gln Glu Arg Glu Pro Ser His Ile Arg Phe Phe Asn Asn Gly
1095 1100 1105

aat gat gtt cat caa tat cgt ttt gga ggt tat cat gga aat aac tac 7935
Asn Asp Val His Gln Tyr Arg Phe Gly Gly Tyr His Gly Asn Asn Tyr
1110 1115 1120 1125

aat gat aac tat ttc agt aga agg ccc aat aag gat tca tat cga gat 7983
Asn Asp Asn Tyr Phe Ser Arg Arg Pro Asn Lys Asp Ser Tyr Arg Asp
1130 1135 1140

cgc cgt cga ttt aat gga cgt cgt tcg aga agt cga tca aga agt gtc 8031
Arg Arg Arg Phe Asn Gly Arg Arg Ser Arg Ser Arg Ser Arg Ser Val
1145 1150 1155

tca cca cag aac tat aaa aga aga aaa ctc gat gaa cat gac aat aat 8079
Ser Pro Gln Asn Tyr Lys Arg Arg Lys Leu Asp Glu His Asp Asn Asn
1160 1165 1170

cat cgt cag cgt tct cca att cgt gat cgt cac aca tct ccc ggc ggc 8127
His Arg Gln Arg Ser Pro Ile Arg Asp Arg His Thr Ser Pro Gly Gly
1175 1180 1185

gaa aag act cct agc tcg aat aat tct gga gaa cga aac tat aaa aga 8175
Glu Lys Thr Pro Ser Ser Asn Asn Ser Gly Glu Arg Asn Tyr Lys Arg
1190 1195 1200 1205

ctg gat att cga gga gct cgt ata aaa act ata aaa gaa gat ttg gaa 8223
Leu Asp Ile Arg Gly Ala Arg Ile Lys Thr Ile Lys Glu Asp Leu Glu
1210 1215 1220

gct gct gct gct gct gct gct gct gct gct gta cca tca gaa gtg caa 8271
Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Val Pro Ser Glu Val Gln
1225 1230 1235

gct tat cct cat gaa cat aca gct gta cat cag agt gtt tat cag atg 8319
Ala Tyr Pro His Glu His Thr Ala Val His Gln Ser Val Tyr Gln Met
1240 1245 1250

cca ggt tat gag tct tat g gttggtttag tttttttaaa aatatcattt 8368 
Pro Gly Tyr Glu Ser Tyr 
1255 

accagggtgc catttttaaa aataaaaata actcggaaaa tatgttttta aaaaatttca 8428
gaa^ttctct catcaacata aaacttgata aaaatcgaat ttttattatt ttctaaacat 8488 
tttttcggtt tttccgaaaa tcaaaaaaaa agtttagaaa atagcaaaaa atcagtttat 8548 
tagaaatcaa attttgttcg ttttgataag aaaaaacata agaaaacatg ttattttctt 8608 
ctgaaaaaag aaaaaaatcg aaaaatctat ggccttttgg caaaatgttt tggaccaaaa 8668 
aacaaaacaa atagcattaa aattattagt tcttttgttt tcttctaaag ttaattttct 8728 
gaaagtcttg cttgtcgtat atcaaataaa aacatttttc ag ga gta tat gat cct 8784

Gly Val Tyr Asp Pro 

1260 

gta aat ggt gtc tac atg tat cct cat cct ggc gct ggt tac tat cca 8832
Val Asn Gly Val Tyr Met Tyr Pro His Pro Gly Ala Gly Tyr Tyr Pro
1265 1270 1275 1280

cct gcc tat cca caa caa ccg att atg tta aca atg gac act ctt cca 8880
Pro Ala Tyr Pro Gln Gln Pro Ile Met Leu Thr Met Asp Thr Leu Pro
1285 1290 1295

ccg aat gat cgt ctt ggt gaa ctt tac gag aaa gcc agt atc gag cag 8928
Pro Asn Asp Arg Leu Gly Glu Leu Tyr Glu Lys Ala Ser Ile Glu Gln






1300 1305 1310



cta gc gtgagcattt tttagtttaa acctttcgga tttacctaga aaaatgttac 8983
Leu Ala

ctttgacgca aaattacggt agcaggtctc gtcgcgaccg aaatttttca gcggagtacg 9043
gtagcttccc atgaattttt ttgctgaact tatctttctg ataacaaata gtaactaaaa 9103
catgaaaaac tgaataaaaa ttgatatctt taccttatag gctctttaag ggcgcagaca 9163
caaaaactga ccggctaccg taatttttcg tcaaaagtca cacatttctc aactggtgaa 9223
atccgaaaaa attgaaattt ttactactcg tccgactgtt tagaaaagat taaaaaaaaa 9283
gaaaaaaaga atgtcggttt ttcgaatttt cgattttcaa agaaaaaaat caatatttaa 9343
aaatcatttt cggtaatttc cctaaatttg taaaatataa tttccaataa atgttttttg 9403
ttttccggaa ttttaataaa aaatcaattt tcgcgtaaca aaaatgcgaa aaaatgacta 9463
gccactcgaa tataataaca catgaaataa aattaaaatt attacag t caa cga gat 9520
Gln Arg Asp
1315



gca att gtg aga caa gaa ctt gag ctg ata cgt att caa atc gaa aga 9568
Ala Ile Val Arg Gln Glu Leu Glu Leu Ile Arg Ile Gln Ile Glu Arg
1320 1325 1330



aaa act gct caa aaa gaa gcg atc aag gcc gct tgc cgt cgt gct aac 9616
Lys Thr Ala Gln Lys Glu Ala Ile Lys Ala Ala Cys Arg Arg Ala Asn
1335 1340 1345

gaa gaa gaa gct aaa cga caa gag gca ctt gca aag acg aaa tat gtt 9664
Glu Glu Glu Ala Lys Arg Gln Glu Ala Leu Ala Lys Thr Lys Tyr Val
1350 1355 1360 1365

tgg gcg att gca aag tca gaa gct gga gag acg tat tac tac aac aaa 9712
Trp Ala Ile Ala Lys Ser Glu Ala Gly Glu Thr Tyr Tyr Tyr Asn Lys
1370 1375 1380

ata aca aaa gag acg cag tgg aca gca cca aca cca gtt caa ggt ctt 9760
Ile Thr Lys Glu Thr Gln Trp Thr Ala Pro Thr Pro Val Gln Gly Leu
1385 1390 1395

ctc gaa ccg gct tgt ggt gca tct cct gat act aca gtt gtc att gct 9808
Leu Glu Pro Ala Cys Gly Ala Ser Pro Asp Thr Thr Val Val Ile Ala
1400 1405 1410



gac gag att act gaa gaa gag caa caa gct gaa gtt ctg gag aag ccg 9856
Asp Glu Ile Thr Glu Glu Glu Gln Gln Ala Glu Val Leu Glu Lys Pro
1415 1420 1425

cgt gtt gtt aag gaa gaa gtt atc gag cca ggt tca caa tct gaa act 9904
Arg Val Val Lys Glu Glu Val Ile Glu Pro Gly Ser Gln Ser Glu Thr
1430 1435 1440 1445

caa aaa gaa tct ccg gag aaa gtt cga gtt gtt gta ccg aaa gtt gaa 9952
Gln Lys Glu Ser Pro Glu Lys Val Arg Val Val Val Pro Lys Val Glu
1450 1455 1460



gtt gaa aga tca ccg tcg cca aaa tct tct cgt gat cgt gag aag gat 10000
Val Glu Arg Ser Pro Ser Pro Lys Ser Ser Arg Asp Arg Glu Lys Asp
1465 1470 1475

cga gag aaa tct cgt gag aaa gat cgt gaa aga gat cgt gac aga aga 10048
Arg Glu Lys Ser Arg Glu Lys Asp Arg Glu Arg Asp Arg Asp Arg Arg
1480 1485 1490

gaa ggt tca aaa cat cgt gat agt tat cat gga cat cga aac ggc agc 10096
Glu Gly Ser Lys His Arg Asp Ser Tyr His Gly His Arg Asn Gly Ser
1495 1500 1505

agt tct gtc agt gaa cga cgt atg cga gag ttc aaa cat gag ctg gaa 10144
Ser Ser Val Ser Glu Arg Arg Met Arg Glu Phe Lys His Glu Leu Glu
1510 1515 1520 1525

cga tcc act cga tct gcc gtt cgt tct cgt cta caa cat caa cgt gac 10192
Arg Ser Thr Arg Ser Ala Val Arg Ser Arg Leu Gln His Gln Arg Asp
1530 1535 1540

gct tct agt gat aag act act tgg ctt att aag tta ata tat cga gag 10240
Ala Ser Ser Asp Lys Thr Thr Trp Leu Ile Lys Leu Ile Tyr Arg Glu
1545 1550 1555

att ttc aaa cga gaa agt gcg cag agt gga ttt gat tat cga ttc agt 10288
Ile Phe Lys Arg Glu Ser Ala Gln Ser Gly Phe Asp Tyr Arg Phe Ser
1560 1565 1570

gag aat act gat aag aag gtaatattat ggaccaaaaa ataaacaatt 10336 
Glu Asn Thr Asp Lys Lys 
1575 

gaaaaaaaaa ccaaaaaaat ctgatgcttg aatttaaaaa aaaacaatga aagagtgcaa 10396 

ttttttaggt tttttggtct ttttttttgg aaaaaccaaa aaataaattt ttttccaaag 10456 

taccaaactt cattttaaaa aattttattt gacataaaaa ttgataattt aaaactaatt 10516 

tgaacatttt tccgcaaaaa ttatagattt ttctgccaat tttagatttt taacgttttt 10576

tttcggacaa ttaatgtttc gaatcatcaa tcagaatgaa tatgatatct gatgaaattc 10636 

aaaaataatg caatttaaat agaaaacggt acaaaagttt tgaaaaattt agaagaattc 10696

taaaaaaaat cctgtccttc aggacaaaat tcaacctttt tctcaaaaca caaaaattac 10756 

tttatattat ttttcag gtg aaa aac tac gtc aag tca tat atc gac cga 10806
Val Lys Asn Tyr Val Lys Ser Tyr Ile Asp Arg
1580 1585 1590

aaa ctc gaa tca aac gat ctc tgg aaa gaa tac tct cgg cca tga 10851
Lys Leu Glu Ser Asn Asp Leu Trp Lys Glu Tyr Ser Arg Pro *
1595 1600 

gctttatttt ttaatttaaa ttttataaaa aaatgtttat gcttgttttt ttctctatag 10911 
ttccctccta tcccccccct cccctatcgc ctaaaaattg atctctgtct gatttcaccg 10971 
atttccgttt tatttgatcc cattgaacga gtatatcatc atgttcctga acttcaacgt 11031 
tcgcacattt tattccccta gttttatgtc cccagaattg ttttatacta tcctgtaatc 11091
cacctcaaaa tgacagccat gaaaagctgt ttttcatgtt ttctattttc ttgttgatcg 11151
tatttgcgcc gctctttgtc gccaaatttt tttttgtaat taaaaaatga attacggatg 11211
ttgaattttt aaatttattt ttttaaagaa aaattgtgga agtttttcag attctatact 11271
gcttattttt acgctaaatt ttttttcgaa gtcccctttt ttcaaatcga agtgtaactg 11331
cgctccacga tcaatagaga ctctccgccc tcgaaccatg ggtctcgtta ggtatttggc 11391
agacttaccg taaattcaaa tgttttatta cttcgcgact aattttttta ttcatgactc 11451
aattttttat caattccaac gaaaaactaa ttaaaaacaa cggaaaacat aacgaaaaat 11511
gcttgaaaat tgcagacatt tccgaaatta attaaattcc taacgagacc catggctcgg 11571
gggcggagtg ttttcgatta gccatggagc gcgttgagat attcctaaat ttttctattc 11631
agatgtcgaa tcaatcaaaa cgggtcacag tgagaattga gcattcgaag aacacttttt 11691
tcgaaaagta attttcaaat tttgatccaa agaaattatt cgtcaatttt cagagtttta 11751
aaattccaac atcaagagca agaagatcgg aagctcaaat atgttctgca caaagctcac 11811
gagaatctga gaaagtgccc attcgagatt ctgacaattg 11851 

<210> 25 
<211> 1604 
<212> PRT 

<213> Caenorhabditis elegans 
Asn Asp


Gin 




535 




Arg 


Glu Arg 


Glu 




550 




valU 


net Arg 


Leu 


565 






Glu 


Glu Glu 


Arg 


Asp 


Glu Ala 


Lys 






600 


Met 


Lys He 


Pro 




615 




Arg 


Asn Ala 


Asn 




630 




Arg 


Thr Gly 


Gly 


645 






Leu 


Thr Glu 


Cys 


Phe 


Ala Lys 


Lys 






680 


Ala 


Lys Gly 


Cys 




695 




Phe 


He He 


Glu 




710 




Lys 


Arg ijys 


inr 


725 






Leu 


Cys Asp 


Thr 


Pro 


Ser Arg 


Phe 






760 


Lys 


Trp Ser 


Val 




775 




Phe 


Ser Lys 


Arg 




790 




vj±n 


fne ..vaj. 


Asn 


805 






Ala 


Ser Cys 


Ser 


Asp 


Glu Asp 


Asp 






840 


Glu 


Glu Glu 


Glu 



395 



Asp 


Leu 


Thr 


Ala 




^ X v 






Pro 


Lys 


Leu 


Ser 


425 








Asp 


Asp 


Ser 


Ala 


Pro 


Asn 


Ser 


Ala 








460 


Pro 


Pro 


Pro 


Lys 






475 


Thr 


Glu 


Lys 


Val 




490 

•a ^ v 






Asp 


Val 


Arg 


His 


505 








Asp 


Arg 


Asp 


Arg 


Arg 


Gly 


Arg 


Gin 








540 


Arg 


Glu 


Val 


Thr 






555 




Gin 


Lys 


Gin 


Lys 




C *7 (\ 






He 


Gin 


Lys 


Glu 


585 








Met 


Glu 


Glu 


Glu 


Glu 


Phe 


Glu 


Leu 








620 


Lys 


Lys 


Lys 


Thr 






635 




Asn 


Cys 


Ser 


Asp 










Pro 


Ser 


Ser 


Cys 


665 








Lys 


Tyr 


Ala 


Ala 


Gly 


Leu 


Arg 


Ala 








700 


Tyr 


He 


Gly 


Glu 






715 




Lys 


Tyr 


Ala 


Ala 




TJft 
/ jU 






Gly 


Val 


Tyr 


Thr 


745 








Val 


Asn 


His 


Ser 


Pro 


Arg 


Thr 


Pro 








780 


Phe 


He 


Lys 


Ala 






795 




Tyr 


Gly 


Arg 


Asp 




o JLU 






Gly 


Trp 


He 


Gly 


825 








Asp 


He 


Val 


Thr 


Lys 


Leu 


Glu 


Gly 






400 



Ser 


Asn 


Glu 


Leu 






415 




Phe 


Phe 


Asn 


Met 




A 7 ft 






Met 


Thr 


Ser 


Glu 


445 








Thr 


P k ro 


Asp 


Asp 


Met 


Thr 


Glu 


Ser 








480 


Asn 


Gin 


Val 


He 






495 




Arg 


Glu 


Asp 


Arg 




cm 

D -L U 






Arg 


Ser 


Arg 


Lys 


525 








Arg 


Glu 


Asp 


Asp 


Lys 


Arg 


His 


Asp 








560 


Asp 


Glu 


Glu 


Arg 






575 




Asn 


Asp 


Glu 


Lys 




CQft 

590 






Lys 


Lys 


Lys 


He 


605 








He 


Ser 


Glu 


Ser 


Glu 


Ser 


Leu 


Thr 








640 


Asn 


Thr 


Cys 


Val 






655 




Gin 


Val 


Lys 


Cys 




b /U 






Val 


Glu 


Ala 


Phe 


685 








Val 


Lys 


Asp 


He 


Val 


Val 


Glu 


Arg 








720 


Asp 


Lys 


Lys 


His 






735 




He 


Asp 


Ala 


Thr 




7SU 






Cys 


Asp 


Pro 


Asn 


765 








Gly 


Asp 


Val 


Asn 


Gly 


Glu 


Glu 


He 








800 


Ala 


Gin 


Gin 


Cys 






815 




Gin 


Lys 


Pro 


Glu 




830 






Thr 


Arg 


His 


He 


845 








Leu 


Asp 


His 


Leu
850 855 860

Gly Asn His Glu Arg Asn Glu Val Ile Lys Asp Met Leu Asp Asp Leu
865 870 875 880

Val Ile Arg Asn Lys Lys His Ala Arg Lys Val Ile Thr Ile Ala Ser
885 890 895

Ala Met Thr Asp Tyr Ser Gln Arg Val Asp Val Ile Gln Glu Ile Phe
900 905 910

Ser Ser Asp Thr Ser Val Thr Val Gln Lys Phe Tyr Ala Lys Glu Gly
915 920 925

Met Ala Thr Leu Met Ala Glu Trp Leu Ser Glu Asp Asp Tyr Ser Leu
930 935 940

Asp Asn Leu Lys Leu Val Gln Ala Ile Leu Lys Ala Leu His Thr Glu
945 950 955 960

Leu Phe Asp Ser Cys Ala Lys Asn Asp Arg Leu Leu Arg Asp Ser Thr
965 970 975

Ser Arg Trp Val Asn Ala Lys Met Asp Glu Tyr Val Asp Ile Gln Val
980 985 990

Ile Ala Asp Ser Leu Ile Ala Cys Val Glu Asp Pro Val Gln Glu Tyr
995 1000 1005

Lys Asp Val Cys Lys Val Ile Glu Lys Gly Leu Val Glu Asn Phe Thr
1010 1015 1020

Arg Ala Lys Glu Met Ala Tyr Arg Leu Asn Gln Tyr Trp Phe Asn Arg
1025 1030 1035 1040

Ser Val Ser Phe Lys Ile Pro Lys Lys Ile Arg Asp Pro Val Pro Lys
1045 1050 1055

Asp Val Pro Val Arg Gln Glu Asp Ala Thr Thr Ser Ser Gln Ser His
1060 1065 1070

Asp Asn Ser Ser Arg Thr Val Ser Pro Asn His Arg His His Ser Ser
1075 1080 1085

Ser Tyr Ser Asn Ser Cys Tyr Gln Glu Arg Glu Pro Ser His Ile Arg
1090 1095 1100

Phe Phe Asn Asn Gly Asn Asp Val His Gln Tyr Arg Phe Gly Gly Tyr
1105 1110 1115 1120

His Gly Asn Asn Tyr Asn Asp Asn Tyr Phe Ser Arg Arg Pro Asn Lys
1125 1130 1135

Asp Ser Tyr Arg Asp Arg Arg Arg Phe Asn Gly Arg Arg Ser Arg Ser
1140 1145 1150

Arg Ser Arg Ser Val Ser Pro Gln Asn Tyr Lys Arg Arg Lys Leu Asp
1155 1160 1165

Glu His Asp Asn Asn His Arg Gln Arg Ser Pro Ile Arg Asp Arg His
1170 1175 1180

Thr Ser Pro Gly Gly Glu Lys Thr Pro Ser Ser Asn Asn Ser Gly Glu
1185 1190 1195 1200

Arg Asn Tyr Lys Arg Leu Asp Ile Arg Gly Ala Arg Ile Lys Thr Ile
1205 1210 1215

Lys Glu Asp Leu Glu Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Val
1220 1225 1230

Pro Ser Glu Val Gln Ala Tyr Pro His Glu His Thr Ala Val His Gln
1235 1240 1245

Ser Val Tyr Gln Met Pro Gly Tyr Glu Ser Tyr Gly Val Tyr Asp Pro
1250 1255 1260

Val Asn Gly Val Tyr Met Tyr Pro His Pro Gly Ala Gly Tyr Tyr Pro
1265 1270 1275 1280

Pro Ala Tyr Pro Gln Gln Pro Ile Met Leu Thr Met Asp Thr Leu Pro
1285 1290 1295

Pro Asn Asp Arg Leu Gly Glu Leu Tyr Glu Lys Ala Ser Ile Glu Gln
1300 1305 1310

Leu Ala Gln Arg Asp Ala Ile Val Arg Gln Glu Leu Glu Leu Ile Arg
1315 1320 1325

Ile Gln Ile Glu Arg Lys Thr Ala Gln Lys Glu Ala Ile Lys Ala Ala
1330 1335 1340

Cys Arg Arg Ala Asn Glu Glu Glu Ala Lys Arg Gln Glu Ala Leu Ala
1345 1350 1355 1360

Lys Thr Lys Tyr Val Trp Ala Ile Ala Lys Ser Glu Ala Gly Glu Thr
1365 1370 1375

Tyr Tyr Tyr Asn Lys Ile Thr Lys Glu Thr Gln Trp Thr Ala Pro Thr
1380 1385 1390

Pro Val Gln Gly Leu Leu Glu Pro Ala Cys Gly Ala Ser Pro Asp Thr
1395 1400 1405

Thr Val Val Ile Ala Asp Glu Ile Thr Glu Glu Glu Gln Gln Ala Glu
1410 1415 1420

Val Leu Glu Lys Pro Arg Val Val Lys Glu Glu Val Ile Glu Pro Gly
1425 1430 1435 1440

Ser Gln Ser Glu Thr Gln Lys Glu Ser Pro Glu Lys Val Arg Val Val
1445 1450 1455

Val Pro Lys Val Glu Val Glu Arg Ser Pro Ser Pro Lys Ser Ser Arg
1460 1465 1470

Asp Arg Glu Lys Asp Arg Glu Lys Ser Arg Glu Lys Asp Arg Glu Arg
1475 1480 1485

Asp Arg Asp Arg Arg Glu Gly Ser Lys His Arg Asp Ser Tyr His Gly
1490 1495 1500

His Arg Asn Gly Ser Ser Ser Val Ser Glu Arg Arg Met Arg Glu Phe
1505 1510 1515 1520

Lys His Glu Leu Glu Arg Ser Thr Arg Ser Ala Val Arg Ser Arg Leu
1525 1530 1535

Gln His Gln Arg Asp Ala Ser Ser Asp Lys Thr Thr Trp Leu Ile Lys
1540 1545 1550

Leu Ile Tyr Arg Glu Ile Phe Lys Arg Glu Ser Ala Gln Ser Gly Phe
1555 1560 1565

Asp Tyr Arg Phe Ser Glu Asn Thr Asp Lys Lys Val Lys Asn Tyr Val
1570 1575 1580

Lys Ser Tyr Ile Asp Arg Lys Leu Glu Ser Asn Asp Leu Trp Lys Glu
1585 1590 1595 1600

Tyr Ser Arg Pro



<210> 26
<211> 7333
<212> DNA
<213> Caenorhabditis

<220>
<221> CDS
<222> (1001)...(1096)
<221> CDS
<222> (1166)...(1453)
<221> CDS
<222> (1501)...(2199)
<221> CDS
<222> (2298)...(2730)
<221> CDS
<222> (3234)...(3847)
<221> CDS
<222> (4148)...(5778)
<221> CDS
<222> (6111)...(6333)

<400> 26

gcttgcatcg aaactcttct cattatttac gtgatgatca catctttcgt tgggctgtac 60
tcccttccgg ttcttcgttc tcttcgacct gttcgaaaag atactccaat gccaacgata 120 
attattaatt cttcaatagt tcttgttgtt gcatccgctc tcccagtagc tgttaacaca 180 
gttggaatga caacttttga tcttctcggc tcccactcat cgctccaatg gcttggatca 240 
tttcgagtcg ttgttgccta taatactcta ttcgtcgtgt tgtctgtcgc atttctcttc 300
aatcaattga ctgcttcaat gagaaggcaa atctggaagt ggtaagctgt gcaatttaaa 360
gtttaaattc ttattaattt ttttgcagga tatgtcaact acgatgtgga atcagacggg 420 
agagtgatgc ggatgaaacc attgagatcc ttagaggcga taagaaaagc aattgaattt 480 
ctttcctttt tcaacacttc ttacccatgt tcatcatttt aatcttttca ttacaaaaac 540 
aaggtcctat tttttttctc gggtactact cgccttttct aataattcag aatcatcaat 600 
ttttgccaac ctctagcttt acatgtctgt ttttcatcat tttctctcaa gcattctcct 660 
aatatattat gttccctagt atttcccctc agtcagcaat tttctcgtcg tcgaaaccgt 720 
ttagctttac tttcaatcaa aacgtggaac atttttcaaa ctatttgaag ccaaaaaaaa 780 
ccagggcttt tgtatatgta ccatattttc cctctgattt tctttatcgc cttctctttt 840 
catgtagaat aactgaaata caaaccattt taattttttc ttttaattat caatactgtc 900 
cgtataggta aaaattattt cttcaggttt gaaaaaatcc gaaatatgta tctgcaactc 960 
ttcagggcat tgcctcaatt aatttttatc taatattcag atg gac caa caa gaa 1015 

Met Asp Gln Gln Glu
1 5 

cca tcg aat aac gta gat acg agc agt att ctt tcg gat gat ggg atg 1063
Pro Ser Asn Asn Val Asp Thr Ser Ser Ile Leu Ser Asp Asp Gly Met
10 15 20

gaa aca cag gaa caa agt tca ttc gtc act gct gtgagtgaaa ttatttaaaa 1116
Glu Thr Gln Glu Gln Ser Ser Phe Val Thr Ala
25 30



tttcgcttcg gagattcatt gtcatataat tcaatttatc gattttcag aca att gac 1174
Thr Ile Asp
35

cta aca gtg gac gac tac gat gaa aca gaa ata cag gag att ctg gat 1222
Leu Thr Val Asp Asp Tyr Asp Glu Thr Glu Ile Gln Glu Ile Leu Asp
40 45 50

aat gga aaa gca gaa gaa gga aca gat gaa gat tct gat tta gtt gaa 1270
Asn Gly Lys Ala Glu Glu Gly Thr Asp Glu Asp Ser Asp Leu Val Glu
55 60 65

ggg att ctt aac gct aat tca gat gtc caa gcg ctc ctt gat gcg cca 1318
Gly Ile Leu Asn Ala Asn Ser Asp Val Gln Ala Leu Leu Asp Ala Pro
70 75 80

tct gag caa gta gct caa gct ctt aat tcg ttc ttc gga aat gag agt 1366
Ser Glu Gln Val Ala Gln Ala Leu Asn Ser Phe Phe Gly Asn Glu Ser
85 90 95

gaa caa gaa gct gtt gca gca caa aga cgg gtt gat gcg gag aag act 1414
Glu Gln Glu Ala Val Ala Ala Gln Arg Arg Val Asp Ala Glu Lys Thr
100 105 110 115

gcc aaa gat gaa gct gaa ctc aag caa cag gaa gag gcg gttagattgc 1463
Ala Lys Asp Glu Ala Glu Leu Lys Gln Gln Glu Glu Ala
120 125

aataaaggaa acaataataa aattatttta ttttcag gaa gat ctt att ata gaa 1518
Glu Asp Leu Ile Ile Glu
130



gat tcg ata gtc aaa act gat gaa gaa aaa caa gca gtt cga aga ctg 1566
Asp Ser Ile Val Lys Thr Asp Glu Glu Lys Gln Ala Val Arg Arg Leu
135 140 145 150

aaa atc aac gaa ttt tta tcg tgg ttc aca agg ctc ctt cca gaa caa 1614
Lys Ile Asn Glu Phe Leu Ser Trp Phe Thr Arg Leu Leu Pro Glu Gln
155 160 165

ttt aaa aat ttc gaa ttc aca aat ccg aac tat ctg aca gaa tct atc 1662
Phe Lys Asn Phe Glu Phe Thr Asn Pro Asn Tyr Leu Thr Glu Ser Ile
170 175 180

agc gat tca ccg gtt gta aat gtc gat aaa tgc aag gaa att gtc aaa 1710
Ser Asp Ser Pro Val Val Asn Val Asp Lys Cys Lys Glu Ile Val Lys
185 190 195

tcg ttc aag gaa agt gaa tca ctt gag gga ctt tca cag aaa tac gaa 1758
Ser Phe Lys Glu Ser Glu Ser Leu Glu Gly Leu Ser Gln Lys Tyr Glu
200 205 210

tta att gat gaa gac gtg cta gtc gct gct att tgt att ggc gtt ctc 1806
Leu Ile Asp Glu Asp Val Leu Val Ala Ala Ile Cys Ile Gly Val Leu
215 220 225 230

gat acc aac aac gaa gaa gat gtc gac ttt aat gtt cta tgt gat gat 1854
Asp Thr Asn Asn Glu Glu Asp Val Asp Phe Asn Val Leu Cys Asp Asp
235 240 245

cgt atc gac gat tgg agt ata gaa aaa tgt gtc act ttt ctt gat tat 1902
Arg Ile Asp Asp Trp Ser Ile Glu Lys Cys Val Thr Phe Leu Asp Tyr
250 255 260

cca aat act gga ttg aat tcg aaa aat gga ccg ttg aga ttc atg cag 1950
Pro Asn Thr Gly Leu Asn Ser Lys Asn Gly Pro Leu Arg Phe Met Gln
265 270 275

ttt act gtc aca tca cct gca tca gca att ctc atg ctc act ctg att 1998
Phe Thr Val Thr Ser Pro Ala Ser Ala Ile Leu Met Leu Thr Leu Ile
280 285 290

cga tta cgc gaa gaa ggg cat ccg tgt cga tta gat ttt gat tca aat 2046
Arg Leu Arg Glu Glu Gly His Pro Cys Arg Leu Asp Phe Asp Ser Asn
295 300 305 310

ccg act gat gat tta ctc ttg aat ttc gat caa gtg gaa ttt tct aat 2094
Pro Thr Asp Asp Leu Leu Leu Asn Phe Asp Gln Val Glu Phe Ser Asn
315 320 325

aat atc att gat acg gca gtc aaa tac tgg gat gat cag aag gaa aac 2142
Asn Ile Ile Asp Thr Ala Val Lys Tyr Trp Asp Asp Gln Lys Glu Asn
330 335 340



ggt gcg cag gat aaa att ggc agg cga gta tta atc aaa ctc aca act 2190
Gly Ala Gln Asp Lys Ile Gly Arg Arg Val Leu Ile Lys Leu Thr Thr
345 350 355

gtt ttg aaa gtattttcat aattatcact taaatacctt ttagagagct 2239
Val Leu Lys
360

caacgacttc ttccacgaaa tcgagtcaac atcagcagaa ttcaaacaac attttgag 2297
aac gcc gtt ggc agc cgt aat gaa ata att caa ctt gtc aac gag aaa 2345
Asn Ala Val Gly Ser Arg Asn Glu Ile Ile Gln Leu Val Asn Glu Lys
365 370 375

att ccc gat ttt gat ggc act gag gct gct gtg aat gag agt ttt aca 2393
Ile Pro Asp Phe Asp Gly Thr Glu Ala Ala Val Asn Glu Ser Phe Thr
380 385 390

tcc gat caa cga acc gaa att atc aac tct cgt gca ata atg gag aca 2441
Ser Asp Gln Arg Thr Glu Ile Ile Asn Ser Arg Ala Ile Met Glu Thr
395 400 405

tta aaa gcc gag atg aag ctc gcc atc gcc gaa gct cag aaa gtt tac 2489
Leu Lys Ala Glu Met Lys Leu Ala Ile Ala Glu Ala Gln Lys Val Tyr
410 415 420 425

gac acc aag act gac ttc gaa aaa ttc ttc gtt ttg aca gtt gga gat 2537
Asp Thr Lys Thr Asp Phe Glu Lys Phe Phe Val Leu Thr Val Gly Asp
430 435 440

ttc tgt ctg gct cgc gcc aat cct tct gac gat gca gaa tta aca tac 2585
Phe Cys Leu Ala Arg Ala Asn Pro Ser Asp Asp Ala Glu Leu Thr Tyr
445 450 455

gcc ata gtt cag gat cgt gtg gat gca atg acc tat aag gtt aaa ttt 2633
Ala Ile Val Gln Asp Arg Val Asp Ala Met Thr Tyr Lys Val Lys Phe
460 465 470

atc gac aca agt cag atc aga gag tgt aac atc aga gat tta gcc atg 2681
Ile Asp Thr Ser Gln Ile Arg Glu Cys Asn Ile Arg Asp Leu Ala Met
475 480 485

act acg cag gga atg tat gac ccg agt ttg aat aca ttt ggt gat gtt 2729
Thr Thr Gln Gly Met Tyr Asp Pro Ser Leu Asn Thr Phe Gly Asp Val
490 495 500 505



g gtgagtttta agttaaaatt gatatttaat attacatctg ttatgtagaa 2780
taagggtttc ggtttttcga ttttattaga aaatcgaaaa ttttagtttt tgtgttaaat 2840
ttaaaaaaat caaaatttga ttcactatca agtccgtttt tctcttctca aaattgacaa 2900
aattttgata atctagaatt ttcgtcccgt atatttttca acgaaaaacc atttaaaatt 2960
ttccatgatt ggattttcgg ttgatctaga aaaaaatggt gctaaacact aaatttgaaa 3020
aagtttgaaa caaattcaaa tccaaatatt tcatgaaaaa cttgtaaaat atattatgta 3080
cacaaaaaaa cgtttcaagt gtagcagttg ttttttgtgg tcccaaaaaa gcagatgttt 3140
gtcagaatcc attaaacaac aaaaaaatcc aaaaactcaa cctggcctag atatcagttt 3200
catgatcgaa gtatctaaaa tcattgtttt cag gt ctt cga gtt gcc tgt cgc 3253
Gly Leu Arg Val Ala Cys Arg
510



caa gtt att tcc tcg agc caa ttt gga aaa aaa aca att tgg ctt acc 3301
Gln Val Ile Ser Ser Ser Gln Phe Gly Lys Lys Thr Ile Trp Leu Thr
515 520 525

ggt aca gct gcc gga cgt cgc aga gct cat aga tcc gat ttt cta att 3349
Gly Thr Ala Ala Gly Arg Arg Arg Ala His Arg Ser Asp Phe Leu Ile
530 535 540

ttc ttc gac aac gga acc gat gca tac gtg tca gct ccg aca atg cct 3397
Phe Phe Asp Asn Gly Thr Asp Ala Tyr Val Ser Ala Pro Thr Met Pro
545 550 555 560

ggt gaa cca ggt tat gaa gtt gct tct gaa aag aaa agt gta ttt tct 3445
Gly Glu Pro Gly Tyr Glu Val Ala Ser Glu Lys Lys Ser Val Phe Ser
565 570 575

ctc aaa gaa atg att gcg aag atg aat gct gct cag att gct att atg 3493
Leu Lys Glu Met Ile Ala Lys Met Asn Ala Ala Gln Ile Ala Ile Met
580 585 590

gtt gga cag cca gta gga aag gaa gga aat ctg gat tat ttt ttg aca 3541
Val Gly Gln Pro Val Gly Lys Glu Gly Asn Leu Asp Tyr Phe Leu Thr
595 600 605

ttt cat tgg att cga caa tct cac aga tca gcg tat att cgg gat ttt 3589
Phe His Trp Ile Arg Gln Ser His Arg Ser Ala Tyr Ile Arg Asp Phe
610 615 620

atg aaa gaa ttt ccg gaa tgg cca ctt ctc aag atg cca gtt gga atg 3637
Met Lys Glu Phe Pro Glu Trp Pro Leu Leu Lys Met Pro Val Gly Met
625 630 635 640

cga atc tgt ttg tac aat tct ctt gtt gat cga cgt aag aaa atg gtg 3685
Arg Ile Cys Leu Tyr Asn Ser Leu Val Asp Arg Arg Lys Lys Met Val
645 650 655

aca gtg att gga act gat cga gct ttt gct att gtg aga cac gaa gca 3733
Thr Val Ile Gly Thr Asp Arg Ala Phe Ala Ile Val Arg His Glu Ala
660 665 670

ccg aat cca ttg gct cct ggg aat aga tgt aca gac ttt ccg tgc aat 3781
Pro Asn Pro Leu Ala Pro Gly Asn Arg Cys Thr Asp Phe Pro Cys Asn
675 680 685

gat aga aat cat cag cat att gac gag aaa atc tat aga gga tct cat 3829
Asp Arg Asn His Gln His Ile Asp Glu Lys Ile Tyr Arg Gly Ser His
690 695 700

aga ttg gaa ggc gca gcg gtaagatttt atttgaaaak' ttgatacaaa 3877
Arg Leu Glu Gly Ala Ala
705 710

acgaggattt tctaaaatta ttttattttt atttgatttg atttcttata attgataatc 3937 
aaggtttttt ggatgttttg ttagagaaat cgaaaaggga aacttccaaa aaaaagctgt 3997
gaaatcaatt tttgctttta ataatatcca agtttcatct tcaaagtttt ttctataaaa 4057
tggacacaaa cttttcaacg ttttcaaaaa aaaggttccg aaaatatgaa aaaaggagaa 4117
agaaatcatg aaaattttgt attatttcag cac aag aag cac atg atc tcg aca 4171

His Lys Lys His Met Ile Ser Thr
715 

aat aac aat ctg tcg caa cgc aga aaa gac cag ctt caa tca cag ttc 4219
Asn Asn Asn Leu Ser Gln Arg Arg Lys Asp Gln Leu Gln Ser Gln Phe
720 725 730

gag cca acc gac atg att cgt tcg atg cca gag agg aat cac caa caa 4267
Glu Pro Thr Asp Met Ile Arg Ser Met Pro Glu Arg Asn His Gln Gln
735 740 745 750

gtc gtt aaa aag aaa acg acg ggc acc aat cag aat gtc gct tcg aca 4315
Val Val Lys Lys Lys Thr Thr Gly Thr Asn Gln Asn Val Ala Ser Thr
755 760 765

aat gat gca aaa tcg aag aga gaa att gaa ata aga aag aaa aat caa 4363
Asn Asp Ala Lys Ser Lys Arg Glu Ile Glu Ile Arg Lys Lys Asn Gln
770 775 780

ttc tta ttt aac aag att att gtt cca ata ccc gtc cta aca cca ttg 4411
Phe Leu Phe Asn Lys Ile Ile Val Pro Ile Pro Val Leu Thr Pro Leu
785 790 795

gaa aat ctc aag gct cat gct caa tgt ggt cca gat tgt cta cag aaa 4459
Glu Asn Leu Lys Ala His Ala Gln Cys Gly Pro Asp Cys Leu Gln Lys
800 805 810

atg gat gcg gat ccg tat gaa gca aga ttc cat cga aat tca cca ata 4507
Met Asp Ala Asp Pro Tyr Glu Ala Arg Phe His Arg Asn Ser Pro Ile
815 820 825 830

cat act cct ctt ttg tgt ggt tgg aga cga att atg tac aca atg agt 4555
His Thr Pro Leu Leu Cys Gly Trp Arg Arg Ile Met Tyr Thr Met Ser
835 840 845

act gga aag aag cgg gga gca gtg aag aaa aac att att tac ttt tct 4603
Thr Gly Lys Lys Arg Gly Ala Val Lys Lys Asn Ile Ile Tyr Phe Ser
850 855 860

cca tgc gga gcc gct ctt cac cag atc agc gac gtc tct gaa tat att 4651
Pro Cys Gly Ala Ala Leu His Gln Ile Ser Asp Val Ser Glu Tyr Ile
865 870 875

cat gtc acc aga agt tta ttg acg att gat tgt ttt tca ttt gat gca 4699
His Val Thr Arg Ser Leu Leu Thr Ile Asp Cys Phe Ser Phe Asp Ala
880 885 890

cga atc gat act gcc act tat att act gtt gac gat aaa tat ttg aag 4747
Arg Ile Asp Thr Ala Thr Tyr Ile Thr Val Asp Asp Lys Tyr Leu Lys
895 900 905 910

gtt gct gat ttt tcg ctt gga acc gaa gga atc cca att cca cta gtg 4795
Val Ala Asp Phe Ser Leu Gly Thr Glu Gly Ile Pro Ile Pro Leu Val
915 920 925






aac agc gtg gat aac gat gag cct cca tca ttg gaa tat tcg aaa cga 4843 
Asn Ser Val Asp Asn Asp Glu Pro Pro Ser Leu Glu Tyr Ser Lys Arg 
930 935 940 

caa ttc caa tac aat gat caa gtg gat ata tcg agt gtt agc cga gat 4891 
Arg Phe Gln Tyr Asn Asp Gln Val Asp Ile Ser Ser Val Ser Arg Asp 
945 950 955 

ttc tat tct gga tgc tct tgt gat ggt gat tgc agt gac gca tcg aag 4939 
Phe Cys Ser Gly Cys Ser Cys Asp Gly Asp Cys Ser Asp Ala Ser Lys 
960 965 970 

tgt gaa tgc caa caa ttg tcc att gaa gca atg aaa cga ctc ccc cat 4987 
Cys Glu Cys Gln Gln Leu Ser Ile Glu Ala Met Lys Arg Leu Pro His 
975 980 985 990 

aat tta caa ttc gac gga cac gac gaa ttg tat gag agt tca gaa aaa 5035 
Asn Leu Gln Phe Asp Gly His Asp Glu Leu Tyr Glu Ser Ser Glu Lys 
995 1000 1005 

caa aat aaa ttt tta aaa cta ttt ttt ttc aga gtt cct cac tat caa 5083 
Gln Asn Lys Phe Leu Lys Leu Phe Phe Phe Arg Val Pro His Tyr Gln 
1010 1015 1020 

aat cgt ctt ctc agc agt aag gtt atc agt gga ctc tat gaa tgc aac 5131 
Asn Arg Leu Leu Ser Ser Lys Val Ile Ser Gly Leu Tyr Glu Cys Asn 
1025 1030 1035 

gat cag tgt tca tgc cat cga aag tct tgt tac aac aga gtt gtt cag 5179 
Asp Gln Cys Ser Cys His Arg Lys Ser Cys Tyr Asn Arg Val Val Gln 
1040 1045 1050 

aac aat atc aag tat cct atg cat gtg agt tta ttt aac gat gat aca 5227 
Asn Asn Ile Lys Tyr Pro Met His Val Ser Leu Phe Asn Asp Asp Thr 
1055 1060 1065 1070 

tac caa tta ttg ttt ttt ctt cag atc ttc aaa act gct caa tcc gga 5275 
Tyr Gln Leu Leu Phe Phe Leu Gln Ile Phe Lys Thr Ala Gln Ser Gly 
1075 1080 1085 

tgg gga gtc cga gct ttg acg gat att cct caa agt acg ttc att tgc 5323 
Trp Gly Val Arg Ala Leu Thr Asp Ile Pro Gln Ser Thr Phe Ile Cys 
1090 1095 1100 

acg tat gta ggt gct ata ctg acg gat gat ttg gct gat gaa cta aga 5371 
Thr Tyr Val Gly Ala Ile Leu Thr Asp Asp Leu Ala Asp Glu Leu Arg 
1105 1110 1115 

aat gcg gat caa tac ttc gct gat ttg gac ttg aag gat acc gtg gag 5419 
Asn Ala Asp Gln Tyr Phe Ala Asp Leu Asp Leu Lys Asp Thr Val Glu 
1120 1125 1130 

ctg gaa aag ggt cgc gaa gat cat gaa act gat ttt ggt tac gga gga 5467 
Leu Glu Lys Gly Arg Glu Asp His Glu Thr Asp Phe Gly Tyr Gly Gly 
1135 1140 1145 1150 

gac gag tca gat tat gat gac gaa gaa gga agt gat ggt gac tcc ggt 5515 
Asp Glu Ser Asp Tyr Asp Asp Glu Glu Gly Ser Asp Gly Asp Ser Gly 
1155 1160 1165 

gat gat gta atg aac aaa atg gtg aaa cgt caa gac tct tcg gag agt 5563 
Asp Asp Val Met Asn Lys Met Val Lys Arg Gln Asp Ser Ser Glu Ser 
1170 1175 1180 

ggt gaa gaa aca aaa cgg ctg aca aga cag aaa aga aag caa tct aaa 5611 
Gly Glu Glu Thr Lys Arg Leu Thr Arg Gln Lys Arg Lys Gln Ser Lys 
1185 1190 1195 

aaa tcc ggt aaa gga gga agt gtg gag aaa gat gac acc act cca aga 5659 
Lys Ser Gly Lys Gly Gly Ser Val Glu Lys Asp Asp Thr Thr Pro Arg 
1200 1205 1210 

gat tca atg gaa aag gat aat att gaa agt aaa gac gaa ccc gtt ttc 5707 
Asp Ser Met Glu Lys Asp Asn Ile Glu Ser Lys Asp Glu Pro Val Phe 
1215 1220 1225 1230 

aat tgg gat aag tat ttt gag ccg ttt cca ttg tat gtt ata gat gca 5755 
Asn Trp Asp Lys Tyr Phe Glu Pro Phe Pro Leu Tyr Val Ile Asp Ala 
1235 1240 1245 

aaa cag aga gga aat ctt gga ag gtaagatcac aattttattc attaaaaaaa 5808 
Lys Gln Arg Gly Asn Leu Gly Arg 
1250 

ttttttagag attttgcttt aaatgataaa aaatggacaa accaaccgtt tgcctcttct 5868 
tttggtttat caacctttct ctatggaaaa aattctgaaa aattaacaaa cagtatttca 5928 
cgttgaaaag tgaagaaaaa agcaaaaaaa ggaaacaaat ttcaaaacgg ttctactcca 5988 
tcttaaaaaa actaaaattc gtaaaaagtc atttggtatg ttttggagac tataatacaa 6048 
ttgagaaaat ttgaaaaacc ggcactccaa agatacaatc ataaattttc gataactttc 6108 
ag a ttc ttg aat cac tct tgc gat ccg aat gtg cac gtt caa cac gtc 6156 
Phe Leu Asn His Ser Cys Asp Pro Asn Val His Val Gln His Val 
1255 1260 1265 

atg tac gat acg cat gat ctt cgt ctt cca tgg gtc gcg ttt ttc aca 6204 
Met Tyr Asp Thr His Asp Leu Arg Leu Pro Trp Val Ala Phe Phe Thr 
1270 1275 1280 1285 

cga aaa tac gtg aaa gcc ggc gat gag cta acc tgg gac tat caa tat 6252 
Arg Lys Tyr Val Lys Ala Gly Asp Glu Leu Thr Trp Asp Tyr Gln Tyr 
1290 1295 1300 

act caa gat cag acg gct acc aca caa ctc aca tgc cac tgc gga gct 6300 
Thr Gln Asp Gln Thr Ala Thr Thr Gln Leu Thr Cys His Cys Gly Ala 
1305 1310 1315 

gaa aac tgc acc ggc cgt ttg ctg aaa agt taa agaattgttg ttatttcctt 6353 
Glu Asn Cys Thr Gly Arg Leu Leu Lys Ser * 
1320 1325 

cccagttatg ttttcctttt tttttaagta tttatttatt tatttaattt ttattttgtt 6413 
tattgttcaa tcgtttaaaa tctccctttg aaaacagcat ctcatatgta tgatctaaac 6473 
acgtatttac ctcgtaaggg tttgccaaat agtttctttg gttttcattt tgattttctc 6533 
tgcgaataaa atgttttaaa aaagacatta tattttttaa tagtcagtac agttttgatg 6593 
tctccaatct atttcagttt acaattttaa aatatagaat atatatattt aggtttcata 6653 
agttatgcat cgattacggg ttctacgtca cttgaagttc tgcatttcca cgtcacatag 6713 
gactactgta gttttaaaaa atactcgttc attttgtaat aatattcctt ctactagttt 6773 
tgcttctggt aataatcgaa tttcaaaact ttagctaaaa tatttctttt tgaagaggct 6833 
gcagcaaaat atgaaaagaa aagtccaact gaacatgtat tacttcgacc cgatacatat 6893 
attggaggtg tcgccatgcg agaagatcaa attatttggc tcagagactc agaaaataga 6953 
aaaatgattg caaaagaagt cacttatcca cctggattat tgaagatttt cgatgagatt 7013 
ctagtgaatg cggctgataa taaagcaaga gattccagta tgaatcggtt ggaagtatgg 7073 
ttagataggt aaatatattg caggaattta tgttctgcga caaagctacg atacgctgtc 7133 
tcgccacgac aattgttttg gtaaatgcat gaaaatcgac gtgcaccttt aaataatact 7193 
gtagttttaa attctcgttt cttcaatttt tcataaatgg ttttccgatg aatatatgat 7253 
tttaaaaaaa tctaaaattc acattaattt ataagaaaca aaattcctca aaaacgaaag 7313 
tttggcgata cagtactatc 

<210> 27 
<211> 1327 
<212> PRT 

<213> Caenorhabditis elegans 

1105 1110 1115 1120 
1250 1255 1260 

His Val Gln His Val Met Tyr Asp Thr His Asp Leu Arg Leu Pro Trp 
1265 1270 1275 1280 

Val Ala Phe Phe Thr Arg Lys Tyr Val Lys Ala Gly Asp Glu Leu Thr 
1285 1290 1295 

Trp Asp Tyr Gln Tyr Thr Gln Asp Gln Thr Ala Thr Thr Gln Leu Thr 
ttattttctt 
tgattttttt 
tttcggaaaa 
ttttctggga 
atttttttgg 
cgaattctac 
aaaatcaaaa 
atttttctcg 
gatttttccc 
tgattttttg 
tttcgaaaaa 
tttgcacaaa 
tttgtttttg 
gaatatttcc 
actttttaaa 



ccccggttgt 
gagttttgcc 
tcgctagttt 
ttgaattgtt 
ttttgtacaa 
ttttttcccc 
tctcgatttt 
aaccgcattt 
aaaaaagcaa 
gagttttccc 
ttgctagttt 
ttgtttgaaa 
atttttgaat 
atttttttcg 
tgaaaaatcg 
tccttttttt 



ttaggaaata 60 
ggttttcagc 120 
tcccctcaat 180 
tgcaaaaaaa 240 
atttttgaat 300 
aaaattttcc 360 
tttacgattt 420 
ttttttctga 480 
gttattcccc 540 
cagttctcag 600 
tcccttcaat 660 
aaaatcaaga 720 
tttttcgtaa 780 
agattttccc 840 
gctatttcta 900 
tttgccattt 960 
tttcccatct aaaattctaa attattcaaa attttacaga atg tca gaa gta atc 1015 
Met Ser Glu Val Ile 
1 5 

gac gaa agt atc tta aat aca gaa gct tca gat gat cca ata cct cca 1063 
Asp Glu Ser Ile Leu Asn Thr Glu Ala Ser Asp Asp Pro Ile Pro Pro 
10 15 20 

tta aat gat gat cag att gct gag ctt ttg ggt gaa gat gga gaa att 1111 
Leu Asn Asp Asp Gln Ile Ala Glu Leu Leu Gly Glu Asp Gly Glu Ile 
25 30 35 

atg gag ata act gag cag aaa g gtgagatttt ttgagtaaaa ccttgaattt 1163 
Met Glu Ile Thr Glu Gln Lys 
40 

tgcactaaaa atttgcaatt ttcgctaaaa attaccttaa aactcgaaaa ttggaatttc 1223 
tagctgagaa aatggccaaa aatgtcgaaa aatgcctccg aaacctgtga aaaaaaaaac 1283 
caccaaaaag gtttctaggc caccaaaaag atttctaggc caccaaaaat gtttctaggc 1343 
caccaaaaat gtttctaggc caccaaaaat gtttctaggc caccaaaaat gtttctaggc 1403 
caccaaaaat gtttctaggc caccaaacag gtttcaatgc caccaaaaat gtttctaggc 1463 
caccaaaaat gtttctaggc ccccaaaaaa tttttctagg ccaccaaaaa ggtttctagg 1523 
ccaccaaaaa tgtttctagg ccaccaaaaa ggtttctagg ccaccaaaca ggtttcaatg 1583 
ccaccaaaaa ggtttctagg ccaccaacca ggtttcaatg ccaccaaaaa tgtttctagg 1643 
ccaccaaaaa ggtttctagg ccaccaaaaa tgtttctagg ccaccaaaaa tgtttctagg 1703 
ccaccaaaaa ggtttctagg ccaccaaaca ggtttcaatg ccaccaaaaa tgtttctagg 1763 
ccaccaaaca ggtttcaatg ccaccaaaaa ggtttctagg ccaccaaaaa ggtttctagg 1823 
ccaccaaaaa tgtttctagg ccaccaaaaa ggtttctagg ccaccaaaca ggtttcaatg 1883 
ccaccaaaaa tgtttctagg ccaccaaaca ggtttcaatg ccaccaaaaa tgtttctagg 1943 
ccaccaaaca ggtttcaatg ccaccaaaaa tgtttctagg ccaccaaaaa ggtttctagg 2003 
ccaccaaaaa tgtttctagg ccaccaaaaa tgtttctagg ccaccaaaaa ggtttctagg 2063 
ccaccaaaca ggtttcaatg ccaccaaaaa tgtttctagg ccaccaaaca ggtttcaatg 2123 
ccaccaaaaa tgtttctagg ccaccaaaaa tgtttctagg cccccaaaaa atttttctag 2183 
gccaccaaaa aggtttctag gccaccaaaa atgtttctag gccaccaaaa aggtttctag 2243 
gccaccaaac aggtttcaat gccaccaaaa aggtttctag gccaccaacc aggtttcaat 2303 
gccaccaaaa atgtttctag gccaccaaaa aggtttctag gccaccaaaa atgtttctag 2363 
gccaccaaaa atgtttctag gccaccaaaa aggtttctag gccaccaaaa aggtttcaag 2423 
gccaccaaaa aggtttcaat gccaccaaaa atgtttctag gccaccaaac aggtttcaat 2483 
gccaccaaaa aggtttctag gccaccaaaa atgtttctag accaccaaaa aggtttctag 2543 
gccaccaaac aggtttcaat gccaccaaaa aggtttctag gccaccaaac aggtttcaat 2603 
gccaccaaaa atgtttctag gccaccaaaa aggtttctag gccaccaaaa atgtttctag 2663 
gccaccaaaa atgtttctag gccaccaaaa aggtttctag gccaccaaac aggtttcaat 2723 
gccaccaaaa atgtttctag gccaccaaac aggtttcaat gcccccaaaa aatttttcta 2783 
ggccaccaaa aaggtttcta ggccatcaaa aatgtttcta gaccaccaaa aaggtttcta 2843 
ggccaccaaa aatgtttcta gaccaccaaa aaggtttcta ggccaccaaa aatgtttcta 2903 
ggccaccaaa aaggtttcta ggccaccaaa aatgtttcta ggccaccaaa aaggtttcta 2963 
ggccaccaaa caggtttcaa tgccaccaaa aaggtttcta ggccaccaac caggtttcaa 3023 
tgccaccaaa aatgtttcta ggccaccaaa aaggtttcta ggccaccaaa aatgtttcta 3083 
ggccaccaaa aatgtttcta ggccaccaaa aaggtttcta ggccaccaaa aaggtttcaa 3143 
ggccaccaaa aaggtttcaa tgccaccaaa aatgtttcta ggccaccaaa caggtttcaa 3203 
tgccaccaaa aaggtttcta ggccaccaaa caggtttcaa tgccaccaaa aaggtttcta 3263 
gaccaccaaa aaggtttcta ggccaccaaa caggtttcaa tgccaccaaa aaggtttcta 3323 
ggccaccaaa caggtttcaa tgccaccaaa aatgtttcta ggccaccaaa aaggtttcta 3383 
ggccaccaaa aatgtttcta ggccaccaaa aatgtttcta ggccaccaaa aaggtttcta 3443 
ggccaccaaa caggtttcaa tgccaccaaa aatgtttcta ggccaccaaa caggtttcaa 3503 
tgcccccaaa aaatttttct aggccaccaa aaaggtttct aggccaccaa aaatgtttct 3563 
agaccaccaa aaaggtttct aggccaccaa aaatgtttct agaccaccaa aaaggtttct 3623 
aggccaccaa aaatgtttct aggccaccaa aaaggtttct aggccaccaa acaggtttca 3683 
atgccaccaa aaatgtttct aggccaccaa aaatgtttct aggcccccaa aaaatttttc 3743 

taggccacca aaaaggtttc aatgccacca aaaatgtttc taggccacca aaaaggtttc 3803 

taggccacca aaaatgtttc taggccacca aaaatgtttc taggccacca aaaaggtttc 3863 

taggccacca aacaggtttc aatgccacca aaaatgtttc taggccacca aacaggtttc 3923 

aatgccacca aaaaggtttc taggccacca aaaatgtttc tagaccacca aaaaggtttc 3983 

taggccacca aacaggtttc aatgccacca aaaaggtttc taggccacca aacaggtttc 4043 

aatgccacca aaaatgtttc taggccacca aaaaggtttc taggccacca aaaatgtttc 4103 

taggccacca aaaatgtttc taggccacca aaaaggtttc taggccacca aacaggtttc 4163 

aatgccacca aaaatgtttc taggccacca aacaggtttc aatgccacca aaaatgtttc 4223 

taggccacca aaaatgtttc taggccccca aaaaattttt ctaggccacc aaaaaggttt 4283 

ctaggccacc aaaaatgttt ctagaccacc aaaaaggttt ctaggccacc aaaaatgttt 4343 

ctagaccacc aaaaaggttt ctaggccacc aaaaatgttt ctaggccacc aaaaaggttt 4403 

ctaggccacc aaaaatgctt ctaggccacc aaaaatgttt ctacgccacc aaaagccgcc 4463 

tcaagcccga aaaatttgaa tttcccgctc aaaaaatcta aaattttccg attttcag 4521 

ac gaa tca gat gat gtg gtg atg ctg gac gac gat gat gac gac act 4568 
Asp Glu Ser Asp Asp Val Val Met Leu Asp Asp Asp Asp Asp Asp Thr 
45 50 55 60 

ccg gaa ccg att ctc gtg att gat atg gat gag gat gag gat gtt act 4616 
Pro Glu Pro Ile Leu Val Ile Asp Met Asp Glu Asp Glu Asp Val Thr 
65 70 75 

aca gat ggt cct gaa tct cag gaa gag ctg gct gca gat gct ccg gct 4664 
Thr Asp Gly Pro Glu Ser Gln Glu Glu Leu Ala Ala Asp Ala Pro Ala 
80 85 90 

cca gga gct cca gaa gct tca gct cca gct caa gaa gcc tca gaa gct 4712 
Pro Gly Ala Pro Glu Ala Ser Ala Pro Ala Gln Glu Ala Ser Glu Ala 
95 100 105 

tca gct ccg gat caa gaa gct cca gaa gtt cag gat gtt ccg gat tct 4760 
Ser Ala Pro Asp Gln Glu Ala Pro Glu Val Gln Asp Val Pro Asp Ser 
110 115 120 

tcg gga gct cca gat gct tca gct cag gct tca gag gct tct gat gct 4808 
Ser Gly Ala Pro Asp Ala Ser Ala Gln Ala Ser Glu Ala Ser Asp Ala 
125 130 135 140 

tca gct cca gaa gtt cca gga tct aca gaa gct cag gat gct cag gat 4856 
Ser Ala Pro Glu Val Pro Gly Ser Thr Glu Ala Gln Asp Ala Gln Asp 
145 150 155 

gtt ccg gat tct ttg gga gct tca gat gct tca gct caa gaa att cca 4904 
Val Pro Asp Ser Leu Gly Ala Ser Asp Ala Ser Ala Gln Glu Ile Pro 
160 165 170 

gaa gct cca gaa gcc cca gaa gct cca gaa atc gcc gct gaa atc gac 4952 
Glu Ala Pro Glu Ala Pro Glu Ala Pro Glu Ile Ala Ala Glu Ile Asp 
175 180 185 

gaa gaa gtg ctg ctc gcc gag caa aat gga gtt ttg gac gaa gga ttt 5000 
Glu Glu Val Leu Leu Ala Glu Gln Asn Gly Val Leu Asp Glu Gly Phe 
190 195 200 

gat gag act gac gat att atc ata gaa gaa gaa gct gta gaa gaa gct 5048 
Asp Glu Thr Asp Asp Ile Ile Ile Glu Glu Glu Ala Val Glu Glu Ala 
205 210 215 220 

gaa gcc gtg gag cca cca att aac act gaa aat cag gaa aac gcg ctg 5096 
Glu Ala Val Glu Pro Pro Ile Asn Thr Glu Asn Gln Glu Asn Ala Leu 
225 230 235 

gaa atg ctc gaa gag cgc ctc aag aag aat gaa gaa aag gaa att gtg 5144 
Glu Met Leu Glu Glu Arg Leu Lys Lys Asn Glu Glu Lys Glu Ile Val 
240 245 250 

gag aaa agt gat gtg aag cca gag gat gaa gat att ata cat atg gag 5192 
Glu Lys Ser Asp Val Lys Pro Glu Asp Glu Asp Ile Ile His Met Glu 
255 260 265 

acg gat tca gtt gaa a gtatgggctt ttttagctgg aaaacaggaa aaaagagcaa 5248 
Thr Asp Ser Val Glu 
270 

aaaattgata catttccagc ttaaccaatc tttttttgag ttgtaaagcc tgaaaattga 5308 

gatttttgta ccaactttta tgataaagct gaaaaaaaaa ttaatttttt gacgaatttt 5368 

tagcggaaac cctgaaaaca tgttttgtct gaaaaataca gaaaatcgtc actttttaca 5428 

ataaattcga gatttttagc tcaaaaatac aacattatag tgcaaaaatc tcagaaaaag 5488 

ccaaaaattt cattcaaaca tctcaaaaaa agcagaaatt ttactcaaaa tatctcagaa 5548 

aaagctaaaa ttttcccaaa aaatcccaga aaaagcagaa ttttcattca aaattcccag 5608 

aaaaagctga taatttacta aacaatctca gaaaatgctg aaattttact caaaagtctt 5668 

cataaaaagc tgaaatttta ctttaaaagt ttaggaaatg ctgcaatttc acttaaaaat 5728 

cccaaaaaag ctaaaatttt cccaaaaaat cccagaaaaa gcagaaattt tactcgaata 5788 

tctcaaaaaa aaaaaagctg aaatttcact caaaaatccc agaaaaagct aaaaatttac 5848 

taaaaaatct caaaaaaaaa aacgctaaaa tttcactcaa aaatctcaga aaaagctaaa 5908 

attttactcg aatatctcaa aaaaaaaaac tgaaattttc ctaaaaaatt tatgaaaaac 5968 

cgaaatttca cttaaaagtc tcataaaaag ccgaattttc ccaaaaaaat cccagaaaaa 6028 

gctaaaaatt tactttaaaa tctcatctgt aattttagtt taaaatctca gaaaaacccg 6088 
aaatttctct caaaaatttg ctgattttca aattttcag cg tca agc cgc aaa cgt 6144 
Thr Ser Ser Arg Lys Arg 
275 

act ggc gga gcc aca agt ccg cgg agc ccg gct caa aaa cga cca aaa 6192 
Thr Gly Gly Ala Thr Ser Pro Arg Ser Pro Ala Gln Lys Arg Pro Lys 
280 285 290 295 

cga cgt gtt caa acg tta tta aag atg cgt cag aat gca att gaa cta 6240 
Arg Arg Val Gln Thr Leu Leu Lys Met Arg Gln Asn Ala Ile Glu Leu 
300 305 310 

ttg aca cga ctt tat ggc tca tgg gat gca caa ttg agc ctc tca aat 6288 
Leu Thr Arg Leu Tyr Gly Ser Trp Asp Ala Gln Leu Ser Leu Ser Asn 
315 320 325 

ctt gag aca att cga ttg ttg ggt gtc aat aat aat agg aag ctt atc 6336 
Leu Glu Thr Ile Arg Leu Leu Gly Val Asn Asn Asn Arg Lys Leu Ile 
330 335 340 

gaa att ttt gag gag aat gag caa g gttaaagcgt ttttaaatgc 6381 
Glu Ile Phe Glu Glu Asn Glu Gln 
345 350 



tatgaaaact gacaaatttt cgataaaaaa acggattttt ggaagaaaat cgcctgaaaa 6441 
ttcatgtttt tctgcaaatt ttgaccaaat tcccaagaaa aatacgattt tttagtccga 6501 
aaatcctcca aaaagatttc taggccacca aaaaggtttc taggccacca agaaagtttc 6561 
taggccacca aagtatttat aggccaccta agatgtttct aggccacctg agatgtttct 6621 
aggtcaccaa aaatgtttct cggtcaccaa aaatgtttca aggccaccga aaaggtttct 6681 
aggccaccta agtatttcta ggccacctaa gatgtttcta ggccacctga gatgtttcta 6741 
ggtcaccaaa aatgtttcta ggttaccaaa aatgtttcaa ggccatcgaa aaggtttcta 6801 
ggccaccaaa gtatttctag gccacctaag atgtttctag gccacctgag atgtttctag 6861 
gtcaccaaaa atgtttcaag gccaccgaaa aggtttctag gccaccaaaa aggtttctag 6921 
gccaccaaaa atatttctag gccacctaag atgtttctag gccacctgag atgtttctag 6981 
gccacctgag atgtttctag gccacctgag atgtttctag gtcaccaaaa atgtttctcg 7041 
gtcaccaaaa atgtttcaag gccaccgaaa aggtttctag gccacctaag tatttctagg 7101 
ccacctaaga tgtttctagg ccacctgaga tgtttctagg tcaccaaaaa tgtttctagg 7161 
ttaccaaaaa tgtttcaagg ccatcgaaaa ggtttctagg ccaccaaagt atttctaggc 7221 
cacctaagat gtttctaggc cacctgagat gtttctaggt caccaaaaat gtttcaaggc 7281 
caccgaaaag gtttctaggc caccaaaaag gtttctaggc caccaaaaat atttctaggc 7341 
caccaaaaat gtttctaggt caccaaaaat gtttctaggt caccaaaaat gtatcaaggc 7401 
caccaaaaag gtttctaggt caccaaaaat gtttctaggc caccaaaaat gtttctaggt 7461 
caccaaaaat gtttctaggc caccaaaaag gtttctaggc caccaaaaag gtttctaggc 7521 
caccaaaaag gtttctaggc caccaaaaag gtttcaaggc caccaaaaag gtttctaggc 7581 
caccaaaaat gtttctaggt caccaaaaat gtttctaggc caccaaagta tttctaggcc 7641 
acctaaaagg tttctaggcc atcaaaaagg tttctaggcc atcaaaaagg attctaggcc 7701 
accaaaaata tttctaggcc acctaagatg tttctaggcc accagagtat ttctaggcca 7761 
cctaagaggt ttctgggcca tcaaaaaggt ttcaagtcca tcaaaaaggt ttctaggcca 7821 
ccaaaaaggt ttctaggcca ccgaaaaggt ttctaggcca ccaaaaaggt ttctagacca 7881 
cctaagacat ttctaggcca acaaaaaggt ttctaggcca ccaagaagcc gaaaaactgt 7941 
ctcaaattcg aattttgcag tg ctc aaa caa aaa gtg tcc gca ctg aca gaa 7993 
Val Leu Lys Gln Lys Val Ser Ala Leu Thr Glu 
355 360 

gag ctg aaa aag gag aag ctg gct cac gcg gga acc cgt tca gca ttg 8041 
Glu Leu Lys Lys Glu Lys Leu Ala His Ala Gly Thr Arg Ser Ala Leu 
365 370 375 

aaa gaa ttg act aat gaa ata act gga atg cgt gta caa atg aat aaa 8089 
Lys Glu Leu Thr Asn Glu Ile Thr Gly Met Arg Val Gln Met Asn Lys 
380 385 390 

cta cgt tca atg gtc act cag cct acg act tcg aaa att att gat agt 8137 
Leu Arg Ser Met Val Thr Gln Pro Thr Thr Ser Lys Ile Ile Asp Ser 
395 400 405 410 

ttt gtt caa cgt cat cag gct ttc gag cag caa caa caa ttc caa cac 8185 
Phe Val Gln Arg His Gln Ala Phe Glu Gln Gln Gln Gln Phe Gln His 
415 420 425 

caa cac cac caa cac cga cca ata atg ttg gct cca cgt cat cat ccg 8233 
Gln His His Gln His Arg Pro Ile Met Leu Ala Pro Arg His His Pro 
430 435 440 

ccg ccg ccc ccg cat ttt aca ccg aat caa cgg gcg gcg gct ccg tat 8281 
Pro Pro Pro Pro His Phe Thr Pro Asn Gln Arg Ala Ala Ala Pro Tyr 
445 450 455 

cat ccg aat atg gtt caa ccg aat cgt ctt gct gct atg cca cat aga 8329 
His Pro Asn Met Val Gln Pro Asn Arg Leu Ala Ala Met Pro His Arg 
460 465 470 

aga ccg att att gga atg cag gtgaaaatgg aatgccatga aaatttcggg 8380 
Arg Pro Ile Ile Gly Met Gln 
475 480 

ccggaaaatt ttggaaaatc ctctaaattt tcaatatttg tcgaaaaaat ctgacaaaaa 8440 
tcgtgtcaaa attcagattt ccgggagaaa aatcgcattt ttgagtaaaa attcgaagaa 8500 
aagcgtctta aattctagat ttattagtta aaattttttt caaattttag tcaagaaaat 8560 
taagaaaaat gcgaaaattt cgagcaaaaa atatagtttt ttggagccga aattgtgaaa 8620 
aatgcgattt ttttcgaaaa atctggacaa aaaatttcaa acaagaaaaa ccactttttt 8680 
aaaaaaattt tcacacaatt tccag caa caa aat tcg gct cca cca caa ttc 8732 
Gln Gln Asn Ser Ala Pro Pro Gln Phe 
485 490 

aac ggt cac caa gct ctc gtc cca tca cct caa tca tca tct gca ttt 8780 
Asn Gly His Gln Ala Leu Val Pro Ser Pro Gln Ser Ser Ser Ala Phe 
495 500 505 

tct cgt cca cca cca act caa ctt gca aca cag aga aga gct cca cca 8828 
Ser Arg Pro Pro Pro Thr Gln Leu Ala Thr Gln Arg Arg Ala Pro Pro 
510 515 520 

ttg gca agt acc ggc ctt ccg gca aca gtc aga tgg gaa gca att cca 8876 
Leu Ala Ser Thr Gly Leu Pro Ala Thr Val Arg Trp Glu Ala Ile Pro 
525 530 535 

ccg cca aaa aat ccg aat gtc ggg cac aat gag cca ccg ctt aac aat 8924 
Pro Pro Lys Asn Pro Asn Val Gly His Asn Glu Pro Pro Leu Asn Asn 
540 545 550 

gga g gttcgtcgtg tgcaacaaaa agagcaccgc ttttccacga cgagtttttg 8978 
Gly 
555 

cgatgatgat tttggtgtga aaattgaaaa actcattttt ttaaagtctg aaatttgaaa 9038 
atttgagaaa agttttttaa aaaaagtttt atgagggatt ttctgacaat tttttataaa 9098 
cggaaaatta cgaaaactcc aaaatttgtg ttctttcgga aaacgaattt gaaatttgaa 9158 
ccaaaatttt gacaattttc tggggatttt tgactggaaa ttcgtttttc atcgattttt 9218 
cctcctttaa ttttcggtaa aacccctgtc tccaattcca g gc cgt gca cag cca 9273 
Gly Arg Ala Gln Pro 
560 

cta atc gat aat aca cgt gta cac gac aat aca att atg ctg tgt gta 9321 
Leu Ile Asp Asn Thr Arg Val His Asp Asn Thr Ile Met Leu Cys Val 
565 570 575 

cca ctt gtc tcc act gca aat aca ata tca tcg ggc gat tcg aca cgt 9369 
Pro Leu Val Ser Thr Ala Asn Thr Ile Ser Ser Gly Asp Ser Thr Arg 
580 585 590 

cta cca aaa gta cca cga atc tac gag aat ctc acg gca aat ccc gat 9417 
Leu Pro Lys Val Pro Arg Ile Tyr Glu Asn Leu Thr Ala Asn Pro Asp 
595 600 605 

ttg agt gtg acg att cat tcg agt gca cag gat ttc cga gag aat tat 9465 
Leu Ser Val Thr Ile His Ser Ser Ala Gln Asp Phe Arg Glu Asn Tyr 
610 615 620 

caa att ggt gga aag att aac tat gaa tat ctc gga gga ttt gat caa 9513 
Gln Ile Gly Gly Lys Ile Asn Tyr Glu Tyr Leu Gly Gly Phe Asp Gln 
625 630 635 640 

tat gtaggtgatg atgttttttt attgagagat aaatacgaaa ttccattaca 9566 
Tyr 



atcgatattt tttgactgaa aaatgtctga aaaatcaaaa attttagcta aaaattgaga 9626 
atatttttgt ttaaaaaaaa tcattgaaat tgattttttt ttattccata aaaatctcgg 9686 
aaaagtcaat tttcagtcat aaatcttctg aaaattatcc aaacaatggg attttctgaa 9746 
attttagctt aaaaattgag gatttcccgg ttttttcaga gaaattccat tacaatcgat 9806 
ttttttactg aaaaatcctc tggaaattaa caaaaaccaa ataaaatgcc ctaatttttt 9866 
tttaaatcca aaaattgttg gattttttca gaaaaaaata ttttttcaat tgactggtgt 9926 
ccaaaaaata tagaaaattc aaattttcca agaaaaatag ccaaaaaaat gtaatttttg 9986 
tctaacaaaa aaattgaata gcgcaaaatt aaattgtcgt tttttttaat ttccctccgg 10046 
ttttgaaagg aaaaaattcc ataaaaatcg aaattttttg actgaaaaat ccatgaaaac 10106 
tcgaattttg agtcaaaaat cctctgaaaa tgctccaaaa tatgagattt tctgaaattt 10166 
catcaaaaat taagaatttc acggtttaaa aaaaattcca ttaaaatcga tatttttcaa 10226 
gtgaaaaatc tctggaaaac tcgatgtttg agtcaaaatt cgtctgaaaa tgctccttta 10286 
aattgaaaaa ttgaaaaaaa aaccgcccac aatatttgca g aat atc caa gtg ttc 10342 
Asn Ile Gln Val Phe 
645 

gtc caa gtg tca tct ctt aaa ttc act gga atg aac ggt tac ccg gat 10390 
Val Gln Val Ser Ser Leu Lys Phe Thr Gly Met Asn Gly Tyr Pro Asp 
650 655 660 

cca gaa gat cgt ata tca att gac tgg gga tgc tcg aaa ttg tgg cct 10438 
Pro Glu Asp Arg Ile Ser Ile Asp Trp Gly Cys Ser Lys Leu Trp Pro 
665 670 675 

tgt aag ccg aaa tct cat cac aaa ttc cgt gta cgc ttc cat caa gca 10486 
Cys Lys Pro Lys Ser His His Lys Phe Arg Val Arg Phe His Gln Ala 
680 685 690 

caa ctg ctg ccg aag aac gat cga att acg att gtg gct gtg gcg aag 10534 
Gln Leu Leu Pro Lys Asn Asp Arg Ile Thr Ile Val Ala Val Ala Lys 
695 700 705 710 

gat aaa act agc gga att att cac att tcg cag gtgaaaaatt ggaaaatttg 10587 
Asp Lys Thr Ser Gly Ile Ile His Ile Ser Gln 
715 720 

cacaaatcca gacaaaaaaa actgaaaaat cgaaaaaatt tttgtaattt tttgccgaaa 10647 

acgaaaatta aaaactgata aaaattgatt tttaaccgga aaatccctga aaaatcaaac 10707 

attttttgct aaaaattgag aattatacgg ttttgggtaa aaaaaaacta tttaaaaaaa 10767 

atattttttc tttaaaaatc tcaacaaaaa aaaaaccaat tttcattcag aaatcccccc 10827 

ggagaattgt caaaattttg ggaatactct gaaatttcga taaacacctc atttttgatt 10887 

aaaattgatt ttttaactga aaaatccctt aaaaaacgaa tattttagtt ttttcacaaa 10947 

aaaatgtgca atttatctga aatttcagca aaaaaaatga aaaaaaaaaa ttccgaaatt 11007 

aaaaactgat aaaaatcgat tttttacttg aaaaattcgt gaaaaatcaa acacattttt 11067 

gctaaccatt gagaatatta cgattttgtg aaaaaaaaaa ccattaaaat tgatttttta 11127 

ttcctaaaaa atgccagaaa aatcaatttt cagtcaaaaa tcaccggaaa attatcaaaa 11187 

ttttgaggtt ttctgtgaaa tttcaagctg aaatttccat ttttgaataa aaaaaatgtg 11247 

gctggattta aaaaaaaacc attaaaattg attttttaac tgaaaaatcc gtatttctct 11307 

gaaatttcag gcaaaaaatg tcatttccga aattaaaaat tgcgacaaaa tcaaataaaa 11367 

ttgatcaaat ttgcaaaaaa aaaaaaactt tcgcaaaaaa tccttaaaat ttacattttt 11427 

caacaaaaac tcgaattttc agtcaaaaat tcgtctgaaa atgctccaaa atatgggatt 11487 

ttttgaaatt ttagctaaaa attgagaatt gcacggtatt tagagaggga aaaattccat 11547 

aaaaatcgat attttcctct ttaaaatctc gaaaaaaatc atcaattttc attcaaaaat 11607 

cccccccgga aaattgtcaa aattttgaga tttttctgaa atttcacgca aaaattttca 11667 

ttttttcag ccc acc ttc atc act ctc gaa tga tcgatctctt cacgtcaaat 11720 
Pro Thr Phe Ile Thr Leu Glu 
725 



gcactttttt ctggattttt ttgttaaaaa atttgaaatt ctcgtgtttt ttcttctgaa 11780 
aaattgcttt ttttgatttt ttctgtaatt ttttttttgt tgattttctt aattttttta 11840 
attttcaaaa aatctttttc atctctttct ctctctctct gaatctcaat tttttcctga 11900 
atttccccgt ttttttctga taattttcaa tatttctctg aatttttcta ttccccccgt 11960 
tgtaatgcca aaatatgtgg taatttctcc ccattttttc gctttattac tatttattct 12020 
attcaattgg tgcctctctc aatgtgttgt atgaaaaaca ctgttttatg gaggttttgg 12080 
agaattttga attttttcgt cgtgattttt attggttttc tttaccaatt caattttttt 12140 
tttaattcga aaatttgtag aaattcactt ttgtagctta aaaaattaaa aattgagaaa 12200 
atttgttcaa aaatggcaaa gttttcgaaa ttttagtcta aaaaaagatt tttttaatat 12260 
agaattttaa aaaattagca cagaaaaatg ccgaaaaatt cgtaattttt catttaaaaa 12320 
tgaaaaaaaa aaaaaacaaa aaaaaaaaaa aaaaagaggg aaaaatccca ttaaaagtag 12380 
ttttttgact gcaaaatcgt ctggaaatta acaaaattta aaaaaatctt ttttacagcc 12440 
catcgtttcc aaaaaccaaa taaaatgcca aaaaaaaatt tttatgcaaa aattctggat 12500 
ttttttccga ttttttcaaa aaattccccc ttctaaaaaa aatggtgaat ttgttcccaa 12560 
aaacccaaaa tttgagattt tctaaaattt tggcaaaaat taagaatttc acggttttga 12620 
gagggaaaaa ctccattaaa attgatgatt ttatgactaa aaattcctaa aaaatcaatt 12680 
ttcagtcaaa aattaaattt 12700 

<210> 29 
<211> 728 
<212> PRT 

<213> Caenorhabditis elegans 

<400> 29 
Met Ser Glu Val Ile Asp Glu Ser Ile Leu Asn Thr Glu Ala Ser Asp 
1 5 10 15 

Asp Pro Ile Pro Pro Leu Asn Asp Asp Gln Ile Ala Glu Leu Leu Gly 
20 25 30 

Glu Asp Gly Glu Ile Met Glu Ile Thr Glu Gln Lys Asp Glu Ser Asp 
35 40 45 

Asp Val Val Met Leu Asp Asp Asp Asp Asp Asp Thr Pro Glu Pro Ile 
50 55 60 

Leu Val Ile Asp Met Asp Glu Asp Glu Asp Val Thr Thr Asp Gly Pro 
65 70 75 80 

Glu Ser Gln Glu Glu Leu Ala Ala Asp Ala Pro Ala Pro Gly Ala Pro 
85 90 95 

Glu Ala Ser Ala Pro Ala Gln Glu Ala Ser Glu Ala Ser Ala Pro Asp 
100 105 110 

Gln Glu Ala Pro Glu Val Gln Asp Val Pro Asp Ser Ser Gly Ala Pro 
115 120 125 

Asp Ala Ser Ala Gln Ala Ser Glu Ala Ser Asp Ala Ser Ala Pro Glu 
130 135 140 

Val Pro Gly Ser Thr Glu Ala Gln Asp Ala Gln Asp Val Pro Asp Ser 
145 150 155 160 

Leu Gly Ala Ser Asp Ala Ser Ala Gln Glu Ile Pro Glu Ala Pro Glu 
165 170 175 

Ala Pro Glu Ala Pro Glu Ile Ala Ala Glu Ile Asp Glu Glu Val Leu 
180 185 190 

Leu Ala Glu Gln Asn Gly Val Leu Asp Glu Gly Phe Asp Glu Thr Asp 
195 200 205 

Asp Ile Ile Ile Glu Glu Glu Ala Val Glu Glu Ala Glu Ala Val Glu 
210 215 220 

Pro Pro Ile Asn Thr Glu Asn Gln Glu Asn Ala Leu Glu Met Leu Glu 
225 230 235 240 

Glu Arg Leu Lys Lys Asn Glu Glu Lys Glu Ile Val Glu Lys Ser Asp 
245 250 255 

Val Lys Pro Glu Asp Glu Asp Ile Ile His Met Glu Thr Asp Ser Val 
260 265 270 

Glu Thr Ser Ser Arg Lys Arg Thr Gly Gly Ala Thr Ser Pro Arg Ser 
275 280 285 

Pro Ala Gln Lys Arg Pro Lys Arg Arg Val Gln Thr Leu Leu Lys Met 
290 295 300 

Arg Gln Asn Ala Ile Glu Leu Leu Thr Arg Leu Tyr Gly Ser Trp Asp 
305 310 315 320 

Ala Gln Leu Ser Leu Ser Asn Leu Glu Thr Ile Arg Leu Leu Gly Val 
325 330 335 

Asn Asn Asn Arg Lys Leu Ile Glu Ile Phe Glu Glu Asn Glu Gln Val 
340 345 350 

Leu Lys Gln Lys Val Ser Ala Leu Thr Glu Glu Leu Lys Lys Glu Lys 
355 360 365 

Leu Ala His Ala Gly Thr Arg Ser Ala Leu Lys Glu Leu Thr Asn Glu 
370 375 380 

Ile Thr Gly Met Arg Val Gln Met Asn Lys Leu Arg Ser Met Val Thr 
385 390 395 400 

Gln Pro Thr Thr Ser Lys Ile Ile Asp Ser Phe Val Gln Arg His Gln 
405 410 415 

Ala Phe Glu Gln Gln Gln Gln Phe Gln His Gln His His Gln His Arg 
420 425 430 

Pro Ile Met Leu Ala Pro Arg His His Pro Pro Pro Pro Pro His Phe 
435 440 445 

Thr Pro Asn Gln Arg Ala Ala Ala Pro Tyr His Pro Asn Met Val Gln 
450 455 460 

Pro Asn Arg Leu Ala Ala Met Pro His Arg Arg Pro Ile Ile Gly Met 
465 470 475 480 

Gln Gln Gln Asn Ser Ala Pro Pro Gln Phe Asn Gly His Gln Ala Leu 
485 490 495 

Val Pro Ser Pro Gln Ser Ser Ser Ala Phe Ser Arg Pro Pro Pro Thr 
500 505 510 

Gln Leu Ala Thr Gln Arg Arg Ala Pro Pro Leu Ala Ser Thr Gly Leu 
515 520 525 

Pro Ala Thr Val Arg Trp Glu Ala Ile Pro Pro Pro Lys Asn Pro Asn 
530 535 540 

Val Gly His Asn Glu Pro Pro Leu Asn Asn Gly Gly Arg Ala Gln Pro 
545 550 555 560 

Leu Ile Asp Asn Thr Arg Val His Asp Asn Thr Ile Met Leu Cys Val 
565 570 575 

Pro Leu Val Ser Thr Ala Asn Thr Ile Ser Ser Gly Asp Ser Thr Arg 
580 585 590 

Leu Pro Lys Val Pro Arg Ile Tyr Glu Asn Leu Thr Ala Asn Pro Asp 
595 600 605 

Leu Ser Val Thr Ile His Ser Ser Ala Gln Asp Phe Arg Glu Asn Tyr 
610 615 620 

Gln Ile Gly Gly Lys Ile Asn Tyr Glu Tyr Leu Gly Gly Phe Asp Gln 
625 630 635 640 

Tyr Asn Ile Gln Val Phe Val Gln Val Ser Ser Leu Lys Phe Thr Gly 
645 650 655 

Met Asn Gly Tyr Pro Asp Pro Glu Asp Arg Ile Ser Ile Asp Trp Gly 
660 665 670 

Cys Ser Lys Leu Trp Pro Cys Lys Pro Lys Ser His His Lys Phe Arg 
675 680 685 

Val Arg Phe His Gln Ala Gln Leu Leu Pro Lys Asn Asp Arg Ile Thr 
690 695 700 

Ile Val Ala Val Ala Lys Asp Lys Thr Ser Gly Ile Ile His Ile Ser 
705 710 715 720 

Gln Pro Thr Phe Ile Thr Leu Glu 
725 



<210> 30 
<211> 11 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 30 

aagatatgtg t 11 

<210> 31 
<211> 11 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 31 

aacttcaaaa t 11 

<210> 32 
<211> 11 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 32 

cttataagtt t 11 

<210> 33 
<211> 11 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 33 

ttttccaaaa a 11 

<210> 34 
<211> 11 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 34 

ttttttaaga t 11 

<210> 35 
<211> 6403 
<212> DNA 

<213> Caenorhabditis elegans 
<400> 35 

aaggaattag actctttatc taaagtgaag aatgatcaat taagaagttt ttgtcccata 60 
gaattaaata taaatggatc tcctggggca gaatctgatt tggcaacatt ttgcacttct 120 
aaaactgatg ctgttttaat gacttctgat gatagtgtga ctggatcgga attatcccct 180 
ttggtcaaag catgcatgct ttcatcaaat ggatttcaga atattagtag gtgcaaagaa 240 
aaagacttgg atgatacctg catgctgcat aagaagtcag aaagcccatt tagagaaaca 300 
gaacctctgg tgtcaccaca ccaagataaa ctcatgtcta tgccagttat gactgtggat 360 
tattccaaaa cagtagttaa agaaccagtt gatacgaggg tttcttgctg caaaaccaaa 420 
gattcagaca tatactgtac tttgaacgat agcaaccctt ctttgtgtaa ctctgaagct 480 
gaaaatattg agccttcagt tatgaagatt tcttcaaata gctttatgaa tgtgcatttg 540 
gaatcaaaac cagttatatg tgatagtaga aatttgacag atcactcaaa atttgcatgt 600 
gaagaatata agcagagcat cggtagcact agttcagctt ctgttaatca ttttgatgat 660 
ttatatcaac ctattgggag ttcaggtatt gcttcatctc ttcagagtct tccaccagga 720 
ataaaggtgg acagtctaac tctcttgaaa tgcggagaga acacatctcc agttctggat 780 
gcagtgctaa agagtaaaaa aagttcagag tttttaaagc atgcagggaa agaaacaata 840 
gtagaagtag gtagtgacct tcctgattca ggaaagggat ttgcttccag ggagaacagg 900 
cgtaataatg ggttatctgg gaaatgtttg caagaggctc aagaagaagg gaattccata 960 
ttgcctgaaa gaagaggaag accagaaatc tctttagatg aaagaggaga aggaggacat 1020 
gtgcatactt ctgatgactc agaagttgta ttttcttctt gtgatttgaa tttaaccatg 1080 
gaagacagtg atggtgtaac ttatgcatta aagtgtgaca gtagtggtca tgccccagaa 1140 
attgtgtcta cagttcatga agattattct ggctcttctg aaagttcaaa tgatgaaagt 1200 
gattcagaag atacagattc ggatgatagc agtattccaa gaaaccgtct ccagtctgtt 1260 
gtggttgtgc caaagaattc tactttgccc atggaagaaa caagtccttg ttcttctcgg 1320 
agcagtcaaa gttatagaca ctattctgac cattgggaag atgagagatt ggagtcaagg 1380 
agacatttgt atgaggaaaa atttgaaagt atagcaagta aagcctgtcc tcaaactgat 1440 
aagtttttcc ttcataaagg aacagagaag aatccggaaa tttcttttac acagtccagt 1500 
agaaaacaaa tagataaccg cctgcctgaa ctttctcatc ctcagagtga tggggttgat 1560 
agtacaagtc atacagatgt gaaatctgac cctctgggtc acccaaattc agaggaaacc 1620 
gtgaaagcca aaataccttc taggcagcaa gaagagctgc caatttattc ttctgatttt 1680 
gaagatgtcc caaataagtc ttggcaacag accactttcc aaaacaggcc agatagtaga 1740 
ctgggaaaaa cagaattgag tttttcttcc tcttgtgaga taccacatgt ggatggcttg 1800 
cactcatcag aagagctcag aaacttaggt tgggacttct ctcaagaaaa gccttctacc 1860 
acgtatcagc aacctgacag tagctatgga gcttgtggtg gacacaagta tcagcaaaat 1920 
gcagaacagt atggtgggac acgtgattac tggcaaggca atggttactg ggatccaaga 1980 
tcaggtagac ctcctggaac tggggttgtg tatgatcgaa ctcaaggaca agtaccagat 2040 
tccctaacag atgatcgtga agaagaggag aattgggatc aacaggatgg atcccatttt 2100 
tcagaccagt ccgataaatt tcttctatcc cttcagaaag acaaggggtc agtgcaagca 2160 
cctgaaataa gcagcaattc cattaaggac actttagctg tgaatgaaaa gaaagatttt 2220 
tcaaaaaact tagaaaaaaa tgatatcaaa gatagagggc ctcttaaaaa aaggaggcag 2280 
gaaatagaga gtgattctga aagtgatggt gagcttcagg acagaaagaa agttagagtg 2340 
gaggtagagc agggagagac atcagtgccc ccaggttcag cactggttgg gccctcctgt 2400 
gtcatggatg acttcaggga cccacagcga tggaaggaat gtgccaagca agggaaaatg 2460 
ccatgttact ttgatcttat tgaagaaaat gtttatttaa cagaaagaaa gaagaataaa 2520 
tctcatcgag atattaagcg aatgcagtgt gagtgtacac ctctttctaa agatgaaaga 2580 
gctcaaggtg aaatagcatg tggggaagat tgtcttaatc gtcttctcat gattgaatgt 2640 
tcttctcggt gtccaaatgg ggattattgt tccaatagac ggtttcagag aaaacagcat 2700 
gcagatgtgg aagtcatact cacagaaaag aaaggctggg gcttgagagc tgccaaagac 2760 
cttccttcga acacctttgt cctagaatat tgtggagagg tactcgatca taaagagttt 2820 
aaagctcgag tgaaggagta tgcacgaaac aaaaacatcc attactattt catggccctg 2880 
aagaatgatg agataataga tgccactcaa aaaggaaatt gctctcgttt catgaatcac 2940 
agctgtgaac caaattgtga aacccaaaaa tggactgtga acggacaact gagggttggg 3000 
ttttttacca ccaaactggt tccttcaggc tcagagttaa cgtttgacta tcagttccag 3060 
agatatggaa aagaagccca gaaatgtttc tgcggatcag ccaattgccg gggttacctg 3120 
ggaggagaaa acagagtcag catcagagca gcaggaggga aaatgaagaa ggaacgatct 3180 
cgtaagaagg attcagtgga tggagagcta gaagctctga tggaaaatgg tgagggtctc 3240 
tctgataaaa accaggtgct cagcttatcc cggctaatgg ttagaattga aactttggag 3300 
cagaaactta cctgtctgga actcatacag aacacacact cacagtcctg cctgaagtcc 3360 
tttctggaac gtcatgggct gtctttgttg tggatctgga tggcagagct aggtgacggc 3420 
cgggaaagta accagaagct tcaggaagag attataaaga ctttggaaca cttgcccatt 3480 
cctactaaaa atatgttgga ggaaagcaaa gtacttccaa ttattcaacg ctggtctcag 3540 
actaagactg ctgtccctcc gttgagtgaa ggagatgggt attctagtga gaatacatcg 3600 
cgtgctcata caccactcaa cacacctgat ccttccacca agctgagcac agaagctgac 3660 
acagacactc ccaagaaact aatgtttcgc agactgaaaa ttataagtga aaatagcatg 3720 
gacagtgcaa tctctgatgc aaccagtgag ctagaaggca aggatggcaa agaggatctt 3780 
gatcaattag aaaatgtccc tgtagaggaa gaggaagaat tgcagtcaca acagctactc 3840 
ccacaacagc tgcctgaatg caaagttgat agtgaaacca acatagaagc tagtaagcta 3900 
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cctacatctg aaccagaagc tgacgctgaa atagagctca aagagagcaa cggcacaaaa 3960 
ctagaagaac ctattaatga agaaacacca tcccaagatg aagaggaggg tgtgtctgat 4020 
gtggagagtg aaaggagcca agaacagcca gataaaacag tggatataag tgatttggcc 4080 
accaaactcc tggacagttg gaaagaccta aaggaggtat atcgaattcc aaagaaaagt 4140 
caaactgaaa aggaaaacac aacaactgaa cgaggaaggg atgctgttgg cttcagagat 4200 
caaacacctg ccccgaagac tcctaatagg tcaagagaga gagacccaga caagcaaact 4260 
caaaataaag agaaaaggaa acgaagaagc tccctctcac caccctcttc tgcctatgag 4320 
cggggaacaa aaaggccaga tgacagatat gatacaccaa cttctaaaaa gaaagtacga 4380 
attaaagacc gcaataaact ttctacagag gaacgccgga agttgtttga gcaagaggtg 4440 
gctcaacggg aggctcagaa acaacagcaa cagatgcaga acctgggaat gacatcacca 4500 
ctgccctatg actctcttgg ttataatgcc ccgcatcatc cctttgctgg ttacccacca 4560 
ggttatccca tgcaggccta tgtggatccc agcaacccta atgctggaaa ggtgctcctg 4620 
cccacaccca gcatggaccc agtgtgttct cctgctcctt atgatcatgc tcagcccttg 4680 
gtgggacatt ctacagaacc cctttctgcc cctccaccag taccagtggt gccacatgtg 4740 
gcagctcctg tggaagtttc cagttcccag tatgtggccc agagtgatgg tgtagtacac 4800 
caagactcca gcgttgctgt cttgccagtg ccggcccccg gcccagttca gggacagaat 4860 
tatagtgttt gggattcaaa ccaacagtct gtcagtgtac agcagcagta ctctcctgca 4920 
cagtctcaag caaccatata ttatcaagga cagacatgtc caacagtcta tggtgtgaca 4980 
tcaccttatt cacagacaac tccaccaatt gtacagagtt atgcccagcc aagtcttcag 5040 
tatatccagg ggcaacagat tttcacagct catccacaag gagtggtggt acagccagcc 5100 
gcagcagtga ctacaatagt tgcaccaggg cagcctcagc ccttgcagcc atctgaaatg 5160 
gttgtgacaa ataatctctt ggatctgccg cccccctctc ctcccaaacc aaaaaccatt 5220 
gtcttacctc ccaactggaa gacagctcga gatccagaag ggaagattta ttactaccat 5280 
gtgatcacaa ggcagactca gtgggatcct cctacttggg aaagcccagg agatgatgcc 5340 
agccttgagc atgaagctga gatggacctg ggaactccaa catatgatga aaaccccatg 5400 
aaggcctcga aaaagcccaa gacagcagaa gcagacacct ccagtgaact agcaaagaaa 5460 
agcaaagaag tattcagaaa agagatgtcc cagttcatcg tccagtgcct gaacccttac 5520 
cggaaacctg actgcaaagt gggaagaatt accacaactg aagactttaa acatctggct 5580 
cgcaagctga ctcacggtgt tatgaataag gagctgaagt actgtaagaa tcctgaggac 5640 
ctggagtgca atgagaatgt gaaacacaaa accaaggagt acattaagaa gtacatgcag 5700 
aagtttgggg ctgtttacaa acccaaagag gacactgaat tagagtgact gttgggccag 5760 
ggtgggagga tgggtggtca ggtaagacag actctaggga gaggaaatcc tgtgggcctt 5820 
tctgtcccac ccctgtcagc actgtgctac tgatgataca tcaccctggg gaattcaacc 5880 
ctgcagatgt caactgaagg ccacaaaaat gaactccatc tacaagtgat tacctagttg 5940 
tgagctgttg gcatgtggtt agaagccatc agaggtgcaa gggcttagaa aagaccctgg 6000 
ccagacctga ctccactctt aaacctgggt cttctccttg gcggtgctgt cagcgcacag 6060 
acccatgcgc atccccaccc acaacccttt accctgatga tctgtattat attttaatgt 6120 
atatgtgaat atattgaaaa taatttgttt tttcctggtt tttgtttggt tttcgttttg 6180 
cttttagcct ctacatgcta ggatcacagg aagactttgt aaggacagtt taagttctcc 6240 
tgcaaggttt aatttgttat catgtaaata ttccaaagca ggctgccttg tggttttggc 6300 
cagccttgtg ctatgttgat aagattgatt tactgcttaa aatcacttta ctttatccaa 6360 
tttttactga actttttatg taaaaaaata aaatcaatta aag 6403 



<210> 36 
<211> 1915 
<212> PRT 

<213> Caenorhabditis elegans 
<400> 36 

Lys Glu Leu Asp Ser Leu Ser Lys Val Lys Asn Asp Gln Leu Arg Ser 
1 5 10 15 

Phe Cys Pro Ile Glu Leu Asn Ile Asn Gly Ser Pro Gly Ala Glu Ser 
20 25 30 

Asp Leu Ala Thr Phe Cys Thr Ser Lys Thr Asp Ala Val Leu Met Thr 
35 40 45 

Ser Asp Asp Ser Val Thr Gly Ser Glu Leu Ser Pro Leu Val Lys Ala 
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50 55 60 

Cys Met Leu Ser Ser Asn Gly Phe Gln Asn Ile Ser Arg Cys Lys Glu 
65 70 75 80 

Lys Asp Leu Asp Asp Thr Cys Met Leu His Lys Lys Ser Glu Ser Pro 
85 90 95 

Phe Arg Glu Thr Glu Pro Leu Val Ser Pro His Gln Asp Lys Leu Met 
100 105 110 

Ser Met Pro Val Met Thr Val Asp Tyr Ser Lys Thr Val Val Lys Glu 
115 120 125 

Pro Val Asp Thr Arg Val Ser Cys Cys Lys Thr Lys Asp Ser Asp Ile 
130 135 140 

Tyr Cys Thr Leu Asn Asp Ser Asn Pro Ser Leu Cys Asn Ser Glu Ala 
145 150 155 160 

Glu Asn Ile Glu Pro Ser Val Met Lys Ile Ser Ser Asn Ser Phe Met 
165 170 175 

Asn Val His Leu Glu Ser Lys Pro Val Ile Cys Asp Ser Arg Asn Leu 
180 185 190 

Thr Asp His Ser Lys Phe Ala Cys Glu Glu Tyr Lys Gln Ser Ile Gly 
195 200 205 

Ser Thr Ser Ser Ala Ser Val Asn His Phe Asp Asp Leu Tyr Gln Pro 
210 215 220 

Ile Gly Ser Ser Gly Ile Ala Ser Ser Leu Gln Ser Leu Pro Pro Gly 
225 230 235 240 

Ile Lys Val Asp Ser Leu Thr Leu Leu Lys Cys Gly Glu Asn Thr Ser 
245 250 255 

Pro Val Leu Asp Ala Val Leu Lys Ser Lys Lys Ser Ser Glu Phe Leu 
260 265 270 

Lys His Ala Gly Lys Glu Thr Ile Val Glu Val Gly Ser Asp Leu Pro 
275 280 285 

Asp Ser Gly Lys Gly Phe Ala Ser Arg Glu Asn Arg Arg Asn Asn Gly 
290 295 300 

Leu Ser Gly Lys Cys Leu Gln Glu Ala Gln Glu Glu Gly Asn Ser Ile 
305 310 315 320 

Leu Pro Glu Arg Arg Gly Arg Pro Glu Ile Ser Leu Asp Glu Arg Gly 
325 330 335 

Glu Gly Gly His Val His Thr Ser Asp Asp Ser Glu Val Val Phe Ser 
340 345 350 

Ser Cys Asp Leu Asn Leu Thr Met Glu Asp Ser Asp Gly Val Thr Tyr 
355 360 365 

Ala Leu Lys Cys Asp Ser Ser Gly His Ala Pro Glu Ile Val Ser Thr 
370 375 380 

Val His Glu Asp Tyr Ser Gly Ser Ser Glu Ser Ser Asn Asp Glu Ser 
385 390 395 400 

Asp Ser Glu Asp Thr Asp Ser Asp Asp Ser Ser Ile Pro Arg Asn Arg 
405 410 415 

Leu Gln Ser Val Val Val Val Pro Lys Asn Ser Thr Leu Pro Met Glu 
420 425 430 

Glu Thr Ser Pro Cys Ser Ser Arg Ser Ser Gln Ser Tyr Arg His Tyr 
435 440 445 

Ser Asp His Trp Glu Asp Glu Arg Leu Glu Ser Arg Arg His Leu Tyr 
450 455 460 

Glu Glu Lys Phe Glu Ser Ile Ala Ser Lys Ala Cys Pro Gln Thr Asp 
465 470 475 480 

Lys Phe Phe Leu His Lys Gly Thr Glu Lys Asn Pro Glu Ile Ser Phe 
485 490 495 

Thr Gln Ser Ser Arg Lys Gln Ile Asp Asn Arg Leu Pro Glu Leu Ser 
500 505 510 

His Pro Gln Ser Asp Gly Val Asp Ser Thr Ser His Thr Asp Val Lys 
515 520 525 

Ser Asp Pro Leu Gly His Pro Asn Ser Glu Glu Thr Val Lys Ala Lys 
530 535 540 

Ile Pro Ser Arg Gln Gln Glu Glu Leu Pro Ile Tyr Ser Ser Asp Phe 
545 550 555 560 

Glu Asp Val Pro Asn Lys Ser Trp Gln Gln Thr Thr Phe Gln Asn Arg 
565 570 575 

Pro Asp Ser Arg Leu Gly Lys Thr Glu Leu Ser Phe Ser Ser Ser Cys 
580 585 590 

Glu Ile Pro His Val Asp Gly Leu His Ser Ser Glu Glu Leu Arg Asn 
595 600 605 

Leu Gly Trp Asp Phe Ser Gln Glu Lys Pro Ser Thr Thr Tyr Gln Gln 
610 615 620 

Pro Asp Ser Ser Tyr Gly Ala Cys Gly Gly His Lys Tyr Gln Gln Asn 
625 630 635 640 

Ala Glu Gln Tyr Gly Gly Thr Arg Asp Tyr Trp Gln Gly Asn Gly Tyr 
645 650 655 

Trp Asp Pro Arg Ser Gly Arg Pro Pro Gly Thr Gly Val Val Tyr Asp 
660 665 670 

Arg Thr Gln Gly Gln Val Pro Asp Ser Leu Thr Asp Asp Arg Glu Glu 
675 680 685 

Glu Glu Asn Trp Asp Gln Gln Asp Gly Ser His Phe Ser Asp Gln Ser 
690 695 700 

Asp Lys Phe Leu Leu Ser Leu Gln Lys Asp Lys Gly Ser Val Gln Ala 
705 710 715 720 

Pro Glu Ile Ser Ser Asn Ser Ile Lys Asp Thr Leu Ala Val Asn Glu 
725 730 735 

Lys Lys Asp Phe Ser Lys Asn Leu Glu Lys Asn Asp Ile Lys Asp Arg 
740 745 750 

Gly Pro Leu Lys Lys Arg Arg Gln Glu Ile Glu Ser Asp Ser Glu Ser 
755 760 765 

Asp Gly Glu Leu Gln Asp Arg Lys Lys Val Arg Val Glu Val Glu Gln 
770 775 780 

Gly Glu Thr Ser Val Pro Pro Gly Ser Ala Leu Val Gly Pro Ser Cys 
785 790 795 800 

Val Met Asp Asp Phe Arg Asp Pro Gln Arg Trp Lys Glu Cys Ala Lys 
805 810 815 

Gln Gly Lys Met Pro Cys Tyr Phe Asp Leu Ile Glu Glu Asn Val Tyr 
820 825 830 

Leu Thr Glu Arg Lys Lys Asn Lys Ser His Arg Asp Ile Lys Arg Met 
835 840 845 

Gln Cys Glu Cys Thr Pro Leu Ser Lys Asp Glu Arg Ala Gln Gly Glu 
850 855 860 

Ile Ala Cys Gly Glu Asp Cys Leu Asn Arg Leu Leu Met Ile Glu Cys 
865 870 875 880 

Ser Ser Arg Cys Pro Asn Gly Asp Tyr Cys Ser Asn Arg Arg Phe Gln 
885 890 895 

Arg Lys Gln His Ala Asp Val Glu Val Ile Leu Thr Glu Lys Lys Gly 
900 905 910 

Trp Gly Leu Arg Ala Ala Lys Asp Leu Pro Ser Asn Thr Phe Val Leu 
915 920 925 

Glu Tyr Cys Gly Glu Val Leu Asp His Lys Glu Phe Lys Ala Arg Val 
930 935 940 

Lys Glu Tyr Ala Arg Asn Lys Asn Ile His Tyr Tyr Phe Met Ala Leu 
945 950 955 960 

Lys Asn Asp Glu Ile Ile Asp Ala Thr Gln Lys Gly Asn Cys Ser Arg 
965 970 975 

Phe Met Asn His Ser Cys Glu Pro Asn Cys Glu Thr Gln Lys Trp Thr 
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980 985 990 

Val Asn Gly Gln Leu Arg Val Gly Phe Phe Thr Thr Lys Leu Val Pro 
995 1000 1005 

Ser Gly Ser Glu Leu Thr Phe Asp Tyr Gln Phe Gln Arg Tyr Gly Lys 
1010 1015 1020 

Glu Ala Gln Lys Cys Phe Cys Gly Ser Ala Asn Cys Arg Gly Tyr Leu 
1025 1030 1035 1040 

Gly Gly Glu Asn Arg Val Ser Ile Arg Ala Ala Gly Gly Lys Met Lys 
1045 1050 1055 

Lys Glu Arg Ser Arg Lys Lys Asp Ser Val Asp Gly Glu Leu Glu Ala 
1060 1065 1070 

Leu Met Glu Asn Gly Glu Gly Leu Ser Asp Lys Asn Gln Val Leu Ser 
1075 1080 1085 

Leu Ser Arg Leu Met Val Arg Ile Glu Thr Leu Glu Gln Lys Leu Thr 
1090 1095 1100 

Cys Leu Glu Leu Ile Gln Asn Thr His Ser Gln Ser Cys Leu Lys Ser 
1105 1110 1115 1120 

Phe Leu Glu Arg His Gly Leu Ser Leu Leu Trp Ile Trp Met Ala Glu 
1125 1130 1135 

Leu Gly Asp Gly Arg Glu Ser Asn Gln Lys Leu Gln Glu Glu Ile Ile 
1140 1145 1150 

Lys Thr Leu Glu His Leu Pro Ile Pro Thr Lys Asn Met Leu Glu Glu 
1155 1160 1165 

Ser Lys Val Leu Pro Ile Ile Gln Arg Trp Ser Gln Thr Lys Thr Ala 
1170 1175 1180 

Val Pro Pro Leu Ser Glu Gly Asp Gly Tyr Ser Ser Glu Asn Thr Ser 
1185 1190 1195 1200 

Arg Ala His Thr Pro Leu Asn Thr Pro Asp Pro Ser Thr Lys Leu Ser 
1205 1210 1215 

Thr Glu Ala Asp Thr Asp Thr Pro Lys Lys Leu Met Phe Arg Arg Leu 
1220 1225 1230 

Lys Ile Ile Ser Glu Asn Ser Met Asp Ser Ala Ile Ser Asp Ala Thr 
1235 1240 1245 

Ser Glu Leu Glu Gly Lys Asp Gly Lys Glu Asp Leu Asp Gln Leu Glu 
1250 1255 1260 

Asn Val Pro Val Glu Glu Glu Glu Glu Leu Gln Ser Gln Gln Leu Leu 
1265 1270 1275 1280 

Pro Gln Gln Leu Pro Glu Cys Lys Val Asp Ser Glu Thr Asn Ile Glu 
1285 1290 1295 

Ala Ser Lys Leu Pro Thr Ser Glu Pro Glu Ala Asp Ala Glu Ile Glu 
1300 1305 1310 

Leu Lys Glu Ser Asn Gly Thr Lys Leu Glu Glu Pro Ile Asn Glu Glu 
1315 1320 1325 

Thr Pro Ser Gln Asp Glu Glu Glu Gly Val Ser Asp Val Glu Ser Glu 
1330 1335 1340 

Arg Ser Gln Glu Gln Pro Asp Lys Thr Val Asp Ile Ser Asp Leu Ala 
1345 1350 1355 1360 

Thr Lys Leu Leu Asp Ser Trp Lys Asp Leu Lys Glu Val Tyr Arg Ile 
1365 1370 1375 

Pro Lys Lys Ser Gln Thr Glu Lys Glu Asn Thr Thr Thr Glu Arg Gly 
1380 1385 1390 

Arg Asp Ala Val Gly Phe Arg Asp Gln Thr Pro Ala Pro Lys Thr Pro 
1395 1400 1405 

Asn Arg Ser Arg Glu Arg Asp Pro Asp Lys Gln Thr Gln Asn Lys Glu 
1410 1415 1420 

Lys Arg Lys Arg Arg Ser Ser Leu Ser Pro Pro Ser Ser Ala Tyr Glu 
1425 1430 1435 1440 

Arg Gly Thr Lys Arg Pro Asp Asp Arg Tyr Asp Thr Pro Thr Ser Lys 
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1445 1450 1455 

Lys Lys Val Arg Ile Lys Asp Arg Asn Lys Leu Ser Thr Glu Glu Arg 
1460 1465 1470 

Arg Lys Leu Phe Glu Gln Glu Val Ala Gln Arg Glu Ala Gln Lys Gln 
1475 1480 1485 

Gln Gln Gln Met Gln Asn Leu Gly Met Thr Ser Pro Leu Pro Tyr Asp 
1490 1495 1500 

Ser Leu Gly Tyr Asn Ala Pro His His Pro Phe Ala Gly Tyr Pro Pro 
1505 1510 1515 1520 

Gly Tyr Pro Met Gln Ala Tyr Val Asp Pro Ser Asn Pro Asn Ala Gly 
1525 1530 1535 

Lys Val Leu Leu Pro Thr Pro Ser Met Asp Pro Val Cys Ser Pro Ala 
1540 1545 1550 

Pro Tyr Asp His Ala Gln Pro Leu Val Gly His Ser Thr Glu Pro Leu 
1555 1560 1565 

Ser Ala Pro Pro Pro Val Pro Val Val Pro His Val Ala Ala Pro Val 
1570 1575 1580 

Glu Val Ser Ser Ser Gln Tyr Val Ala Gln Ser Asp Gly Val Val His 
1585 1590 1595 1600 

Gln Asp Ser Ser Val Ala Val Leu Pro Val Pro Ala Pro Gly Pro Val 
1605 1610 1615 

Gln Gly Gln Asn Tyr Ser Val Trp Asp Ser Asn Gln Gln Ser Val Ser 
1620 1625 1630 

Val Gln Gln Gln Tyr Ser Pro Ala Gln Ser Gln Ala Thr Ile Tyr Tyr 
1635 1640 1645 

Gln Gly Gln Thr Cys Pro Thr Val Tyr Gly Val Thr Ser Pro Tyr Ser 
1650 1655 1660 

Gln Thr Thr Pro Pro Ile Val Gln Ser Tyr Ala Gln Pro Ser Leu Gln 
1665 1670 1675 1680 

Tyr Ile Gln Gly Gln Gln Ile Phe Thr Ala His Pro Gln Gly Val Val 
1685 1690 1695 

Val Gln Pro Ala Ala Ala Val Thr Thr Ile Val Ala Pro Gly Gln Pro 
1700 1705 1710 

Gln Pro Leu Gln Pro Ser Glu Met Val Val Thr Asn Asn Leu Leu Asp 
1715 1720 1725 

Leu Pro Pro Pro Ser Pro Pro Lys Pro Lys Thr Ile Val Leu Pro Pro 
1730 1735 1740 

Asn Trp Lys Thr Ala Arg Asp Pro Glu Gly Lys Ile Tyr Tyr Tyr His 
1745 1750 1755 1760 

Val Ile Thr Arg Gln Thr Gln Trp Asp Pro Pro Thr Trp Glu Ser Pro 
1765 1770 1775 

Gly Asp Asp Ala Ser Leu Glu His Glu Ala Glu Met Asp Leu Gly Thr 
1780 1785 1790 

Pro Thr Tyr Asp Glu Asn Pro Met Lys Ala Ser Lys Lys Pro Lys Thr 
1795 1800 1805 

Ala Glu Ala Asp Thr Ser Ser Glu Leu Ala Lys Lys Ser Lys Glu Val 
1810 1815 1820 

Phe Arg Lys Glu Met Ser Gln Phe Ile Val Gln Cys Leu Asn Pro Tyr 
1825 1830 1835 1840 

Arg Lys Pro Asp Cys Lys Val Gly Arg Ile Thr Thr Thr Glu Asp Phe 
1845 1850 1855 

Lys His Leu Ala Arg Lys Leu Thr His Gly Val Met Asn Lys Glu Leu 
1860 1865 1870 

Lys Tyr Cys Lys Asn Pro Glu Asp Leu Glu Cys Asn Glu Asn Val Lys 
1875 1880 1885 

His Lys Thr Lys Glu Tyr Ile Lys Lys Tyr Met Gln Lys Phe Gly Ala 
1890 1895 1900 

Val Tyr Lys Pro Lys Glu Asp Thr Glu Leu Glu 
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